Preface


For some time now, topology has been firmly established as one of 
the basic disciplines of pure mathematics. Its ideas and methods have 
transformed large parts of geometry and analysis almost beyond recogni- 
tion. It has also greatly stimulated the growth of abstract algebra. As 
things stand today, much of modern pure mathematics must remain a 
closed book to the person who does not acquire a working knowledge of at 
least the elements of topology. 

There are many domains in the broad field of topology, of which the 
following are only a few: the homology and cohomology theory of com- 
plexes, and of more general spaces as well; dimension theory; the theory 
of differentiable and Riemannian manifolds and of Lie groups; the theory 
of continuous curves; the theory of Banach and Hilbert spaces and their 
operators, and of Banach algebras; and abstract harmonic analysis on 
locally compact groups. Each of these subjects starts from roughly the 
same body of fundamental knowledge and develops its own methods of 
dealing with its own characteristic problems. The purpose of Part 1 of 
this book is to make available to the student this “hard core” of funda- 
mental topology; specifically, to make it available in a form which is 
general enough to meet the needs of modern mathematics, and yet is 
unburdened by excess baggage best left in the research journals.

A topological space can be thought of as a set from which has been 
swept away all structure irrelevant to the continuity of functions defined 
on it. Part 1 therefore begins with an informal (but quite extensive) 
treatment of sets and functions. Some writers deal with the theory of 
metric spaces as if it were merely a fragment of the general theory of 
topological spaces. This practice is no doubt logically correct, but it 
seems to me to violate the natural relation between these topics, in which 
metric spaces motivate the more general theory. Metric spaces are 
therefore discussed rather fully in Chapter 2, and topological spaces are 
introduced in Chapter 3. The remaining four chapters in Part 1 are 
concerned with various kinds of topological spaces of special importance 
in applications and with the continuous functions carried by them. 

It goes without saying that one aspect of this type of mathematics 
is its logical precision. Too many writers, however, are content with 
this, and make little effort to help the reader maintain his orientation in 
the midst of mazes of detail. One of the main features of this book is the
attention given to motivating the ideas under discussion. On every 
possible occasion I have tried to make clear the intuitive meaning of what 
is taking place, and diagrams are provided, whenever it seems feasible, 
to help the reader develop skill in using his imagination to visualize 
abstract ideas. Also, each chapter begins with a brief introduction 
which describes its main theme in general terms. Courses in topology 
are being taught more and more widely on the undergraduate level in 
our colleges and universities, and I hope that these features, which tend 
to soften the austere framework of definitions, theorems, and proofs, 
will make this book readable and easy to use as a text. 

Historically speaking, topology has followed two principal lines of 
development. In homology theory, dimension theory, and the study of 
manifolds, the basic motivation appears to have come from geometry. 
In these fields, topological spaces are looked upon as generalized geometric 
configurations, and the emphasis is placed on the structure of the spaces 
themselves. In the other direction, the main stimulus has been analysis. 
Continuous functions are the chief objects of interest here, and topological 
spaces are regarded primarily as carriers of such functions and as domains 
over which they can be integrated. These ideas lead naturally into the 
theory of Banach and Hilbert spaces and Banach algebras, the modern 
theory of integration, and abstract harmonic analysis on locally compact 
groups. 

In Part 1 of this book, I have attempted an even balance between 
these two points of view. This part is suitable for a basic semester course, 
and most of the topics treated are indispensable for further study in 
almost any direction. If the instructor wishes to devote a second 
semester to some of the extensions and applications of the theory, many 
possibilities are open. If he prefers applications in modern analysis, he 
can continue with Part 2 of this book, supplemented, perhaps, with a 
brief treatment of measure and integration aimed at the general form of 
the Riesz representation theorem. Or if his tastes incline him toward 
the geometric aspects of topology, he can switch over to one of the many 
excellent books which deal with these matters. 

The instructor who intends to continue with Part 2 must face a 
question which only he can answer. Do his students know enough about 
algebra? This question is forced to the surface by the fact that Chapters 
9 to 11 are as much about algebra as they are about topology and analy- 
sis. If his students know little or nothing about modern algebra, then a 
careful and detailed treatment of Chapter 8 should make it possible to 
proceed without difficulty. And if they know a good deal, then a quick 
survey of Chapter 8 should suffice. It is my own opinion that education 
in abstract mathematics ought to begin on the junior level with a course 
in modern algebra, and that topology should be offered only to students 
who have acquired some familiarity, through such a course, with abstract
methods. 

Part 3 is intended for individual study by exceptionally well-qualified 
students with a reasonable knowledge of complex analysis. Its principal 
purpose is to unify Parts 1 and 2 into a single body of thought, along the 
lines mapped out in the last section of Chapter 11. 

Taken as a whole, the present work stands at the threshold of the 
more advanced books by Rickart [34], Loomis [27], and Naimark [32]; 
and much of its subject matter can be found (in one form or another and 
with innumerable applications to analysis) in the encyclopedic treatises 
of Dunford and Schwartz [8] and Hille and Phillips [20].¹ This book is
intended to be elementary, in the sense of being accessible to well-trained 
undergraduates, while those just mentioned are not. Its prerequisites 
are almost negligible. Several facts about determinants are used without 
proof in Chapter 11, and Chapter 12 leans heavily on Liouville’s theorem 
and the Laurent expansion from complex analysis. With these excep- 
tions, the book is essentially self-contained. 

It seems to me that a worthwhile distinction can be drawn between 
two types of pure mathematics. The first, which unfortunately is
somewhat out of style at present, centers attention on particular
functions and theorems which are rich in meaning and history, like the gamma
function and the prime number theorem, or on juicy individual facts, like 
Euler’s wonderful formula 


$$1 + \frac{1}{4} + \frac{1}{9} + \cdots = \frac{\pi^2}{6}.$$


The second is concerned primarily with form and structure. The present 
book belongs to this camp; for its dominant theme can be expressed in 
just two words, continuity and linearity, and its purpose is to illuminate 
the meanings of these words and their relations to each other. Mathe- 
matics of this kind hardly ever yields great and memorable results like 
the prime number theorem and Euler’s formula. On the contrary, its 
theorems are generally small parts of a much larger whole and derive 
their main significance from the place they occupy in that whole. In 
my opinion, if a body of mathematics like this is to justify itself, it must 
possess aesthetic qualities akin to those of a good piece of architecture. 
It should have a solid foundation, its walls and beams should be firmly 
and truly placed, each part should bear a meaningful relation to every 
other part, and its towers and pinnacles should exalt the mind. It is my 
hope that this book can contribute to a wider appreciation of these mathe- 
matical values. 


George F. Simmons 


¹ The numbers in brackets refer to works listed in the Bibliography.


A Note to the Reader 


Two matters call for special comment: the problems and the proofs. 

The majority of the problems are corollaries and extensions of 
theorems proved in the text, and are freely drawn upon at all later stages 
of the book. In general, they serve as a bridge between ideas just treated 
and developments yet to come, and the reader is strongly urged to master 
them as he goes along. 

In the earlier chapters, proofs are given in considerable detail, in an 
effort to smooth the way for the beginner. As our subject unfolds 
through the successive chapters and the reader acquires experience in 
following abstract mathematical arguments, the proofs become briefer 
and minor details are more and more left for the reader to fill in for 
himself. The serious student will train himself to look for gaps in proofs, 
and should regard them as tacit invitations to do a little thinking on his 
own. Phrases like “it is easy to see,” “one can easily show,” “evidently,”
“clearly,” and so on, are always to be taken as warning signals which
indicate the presence of gaps, and they should put the reader on his
guard.

It is a basic principle in the study of mathematics, and one too 
seldom emphasized, that a proof is not really understood until the 
stage is reached at which one can grasp it as a whole and see it as a single 
idea. In achieving this end, much more is necessary than merely follow- 
ing the individual steps in the reasoning. This is only the beginning. 
A proof should be chewed, swallowed, and digested, and this process of 
assimilation should not be abandoned until it yields a full comprehension 
of the overall pattern of thought. 
Topology


CHAPTER ONE 


Sets and Functions 


It is sometimes said that mathematics is the study of sets and
functions. Naturally, this oversimplifies matters; but it does come as close
to the truth as an aphorism can. 

The study of sets and functions leads two ways. One path goes 
down, into the abysses of logic, philosophy, and the foundations of 
mathematics. The other goes up, onto the highlands of mathematics 
itself, where these concepts are indispensable in almost all of pure mathe- 
matics as it is today. Needless to say, we follow the latter course. We 
regard sets and functions as tools of thought, and our purpose in this 
chapter is to develop these tools to the point where they are sufficiently 
powerful to serve our needs through the rest of this book. 

As the reader proceeds, he will come to understand that the words 
set and function are not as simple as they may seem. In a sense, they 
are simple; but they are potent words, and the quality of simplicity they 
possess is that which lies on the far side of complexity. They are like 
seeds, which are primitive in appearance but have the capacity for vast 
and intricate development. 


1. SETS AND SET INCLUSION 


We adopt a naive point of view in our discussion of sets and assume 
that the concepts of an element and of a set of elements are intuitively 
clear. By an element we mean an object or entity of some sort, as, for
example, a positive integer, a point on the real line (= a real number),
or a point in the complex plane (= a complex number). A set is a
collection or aggregate of such elements, considered together or as a whole.
Some examples are furnished by the set of all even positive integers, the 
set of all rational points on the real line, and the set of all points in the 
complex plane whose distance from the origin is 1 (= the unit circle in the 
plane). We reserve the word class to refer to a set of sets. We might 
speak, for instance, of the class of all circles in a plane (thinking of each 
circle as a set of points). It will be useful in the work we do if we carry 
this hierarchy one step further and use the term family for a set of classes. 
One more remark: the words element, set, class, and family are not 
intended to be rigidly fixed in their usage; we use them fluidly, to express 
varying attitudes toward the mathematical objects and systems we 
study. It is entirely reasonable, for instance, to think of a circle not as 
a set of points, but as a single entity in itself, in which case we might 
justifiably speak of the set of all circles in a plane. 

There are two standard notations available for designating a par- 
ticular set. Whenever it is feasible to do so, we can list its elements 
between braces. Thus $\{1, 2, 3\}$ signifies the set consisting of the first
three positive integers, $\{1, i, -1, -i\}$ is the set of the four fourth roots
of unity, and $\{\pm 1, \pm 3, \pm 5, \ldots\}$ is the set of all odd integers. This
manner of specifying a set, by listing its elements, is unworkable in many 
circumstances. We are then obliged to fall back on the second method, 
which is to use a property or attribute that characterizes the elements of 
the set in question. If P denotes a certain property of elements, then 
$\{x : P\}$ stands for the set of all elements $x$ for which the property $P$ is
meaningful and true. For example, the expression 


$$\{x : x \text{ is real and irrational}\},$$


which we read the set of all x such that x is real and irrational, denotes the 
set of all real numbers which cannot be written as the quotient of two 
integers. The set under discussion contains all those elements (and no 
others) which possess the stated property. The three sets of numbers 
described at the beginning of this paragraph can be written either way: 


$$\{1, 2, 3\} = \{n : n \text{ is an integer and } 0 < n < 4\},$$
$$\{1, i, -1, -i\} = \{z : z \text{ is a complex number and } z^4 = 1\},$$
and
$$\{\pm 1, \pm 3, \pm 5, \ldots\} = \{n : n \text{ is an odd integer}\}.$$


We often shorten our notation. For instance, the last two sets mentioned
might perfectly well be written $\{z : z^4 = 1\}$ and $\{n : n \text{ is odd}\}$. Our
purpose is to be clear and to avoid misunderstandings, and if this can be
achieved with less notation, so much the better. In the same vein we can
write
$$\text{the unit circle} = \{z : |z| = 1\},$$
$$\text{the closed unit disc} = \{z : |z| \le 1\},$$
and
$$\text{the open unit disc} = \{z : |z| < 1\}.$$
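Both notations have direct counterparts in a programming language, which can make the distinction concrete. What follows is only a minimal sketch in Python (a language chosen here for illustration; it is not used in this book). The names first_three, built, candidates, and roots are hypothetical, and the finite search range and candidate pool are assumptions, needed because a program cannot run through an infinite universal set.

```python
# Roster notation: list the elements between braces.
first_three = {1, 2, 3}

# Set-builder notation {n : n is an integer and 0 < n < 4}.
# Assumption: we only search the integers from -10 to 10.
built = {n for n in range(-10, 11) if 0 < n < 4}

# {z : z^4 = 1}, tested on a small pool of candidate complex numbers.
# Assumption: the pool is arbitrary and chosen only for illustration.
candidates = {1, -1, 1j, -1j, 2, 1 + 1j, 0}
roots = {z for z in candidates if z * z * z * z == 1}

print(built == first_three)          # the two descriptions agree
print(roots == {1, 1j, -1, -1j})     # the four fourth roots of unity
```

The property after the colon plays the role of the condition in the comprehension: an element is kept exactly when the property is true of it.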


We use a special system of notation for designating intervals of various 
kinds on the real line. If $a$ and $b$ are real numbers such that $a < b$,
then the following symbols on the left are defined to be the indicated sets
on the right:
$$[a,b] = \{x : a \le x \le b\},$$
$$(a,b] = \{x : a < x \le b\},$$
$$[a,b) = \{x : a \le x < b\},$$
$$(a,b) = \{x : a < x < b\}.$$

We speak of these as the closed, the open-closed, the closed-open, and
the open intervals from $a$ to $b$. In particular, $[0,1]$ is the closed unit
interval, and (0,1) is the open unit interval. 
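The four kinds of interval differ only in whether each endpoint belongs to the set. As a rough illustration, here is a small Python sketch of the membership test; the function name in_interval and its two flags are hypothetical and are introduced only for this example.

```python
def in_interval(x, a, b, closed_left, closed_right):
    """Return True when x lies in the interval from a to b (with a < b).

    closed_left / closed_right say whether the endpoint a / b is included.
    """
    left_ok = (a <= x) if closed_left else (a < x)
    right_ok = (x <= b) if closed_right else (x < b)
    return left_ok and right_ok

# The endpoints 0 and 1 tell the four unit intervals apart.
print(in_interval(0, 0, 1, True, True))     # 0 is in [0,1]
print(in_interval(0, 0, 1, False, True))    # 0 is not in (0,1]
print(in_interval(1, 0, 1, True, False))    # 1 is not in [0,1)
print(in_interval(0.5, 0, 1, False, False)) # 0.5 is in (0,1)
```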

There are certain logical difficulties which arise in the foundations of 
the theory of sets (see Problem 1). We avoid these difficulties by assum- 
ing that each discussion in which a number of sets are involved takes place 
in the context of a single fixed set. This set is called the universal set. 
It is denoted by $U$ in this section and the next, and every set mentioned is
assumed to consist of elements in $U$. In later chapters there will always
be on hand a given space within which we work, and this will serve
without further comment as our universal set.¹ It is often convenient
to have available in $U$ a set containing no elements whatever; we call this
the empty set and denote it by the symbol $\emptyset$. A set is said to be finite
if it is empty or consists of $n$ elements for some positive integer $n$;
otherwise, it is said to be infinite.

We usually denote elements by small letters and sets by large letters. 
If $x$ is an element and $A$ is a set, the statement that $x$ is an element of $A$
(or belongs to $A$, or is contained in $A$) is symbolized by $x \in A$. We
denote the negation of this, namely, the statement that $x$ is not an element
of $A$, by $x \notin A$.

Two sets $A$ and $B$ are said to be equal if they consist of exactly the
same elements; we denote this relation by $A = B$ and its negation by
$A \ne B$. We say that $A$ is a subset of $B$ (or is contained in $B$) if each
element of $A$ is also an element of $B$. This relation is symbolized by $A \subseteq B$.
We sometimes express this by saying that $B$ is a superset of $A$ (or
contains $A$). $A \subseteq B$ allows for the possibility that $A$ and $B$ might be equal.
If $A$ is a subset of $B$ and is not equal to $B$, we say that $A$ is a proper subset
of $B$ (or is properly contained in $B$). This relation is denoted by $A \subset B$.
We can also express $A \subset B$ by saying that $B$ is a proper superset of $A$
(or properly contains $A$). The relation $\subseteq$ is usually called set inclusion.

¹ The words set and space are often used in loose contrast to one another. A set is
merely an amorphous collection of elements, without coherence or form. When some
kind of algebraic or geometric structure is imposed on a set, so that its elements are
organized into a systematic whole, then it becomes a space.

We sometimes reverse the symbols introduced in the previous
paragraph. Thus $A \subseteq B$ and $A \subset B$ are occasionally written in the
equivalent forms $B \supseteq A$ and $B \supset A$.

It will often be convenient to have a symbol for logical implication, 
and $\Rightarrow$ is the symbol we use. If $p$ and $q$ are statements, then $p \Rightarrow q$
means that $p$ implies $q$, or that if $p$ is true, then $q$ is also true. Similarly,
$\Leftrightarrow$ is our symbol for two-way implication or logical equivalence. It
means that the statement on each side implies the statement on the other, 
and is usually read if and only if, or is equivalent to. 

The main properties of set inclusion are obvious. They are the 
following: 

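(1) $A \subseteq A$ for every $A$;

(2) $A \subseteq B$ and $B \subseteq A \Rightarrow A = B$;

(3) $A \subseteq B$ and $B \subseteq C \Rightarrow A \subseteq C$.

It is quite important to observe that (1) and (2) can be combined into the
single statement that $A = B \Leftrightarrow A \subseteq B$ and $B \subseteq A$. This remark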
contains a useful principle of proof, namely, that the only way to show 
that two sets are equal, apart from merely inspecting them, is to show 
that each is a subset of the other. 
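For finite sets this principle of proof can be imitated on a computer. The sketch below is in Python and is offered only as an illustration: Python's built-in comparison operators on sets happen to mean inclusion (<=) and proper inclusion (<), and the function name equal_by_double_inclusion is hypothetical.

```python
A = {1, 2}
B = {1, 2, 3}

# <= is inclusion (subset); < is proper inclusion (proper subset).
print(A <= B, A < B)    # A is a subset of B, and a proper one
print(B <= B, B < B)    # every set is a subset, but not a proper subset, of itself

def equal_by_double_inclusion(X, Y):
    """Decide whether X = Y by checking that each is a subset of the other."""
    return X <= Y and Y <= X

print(equal_by_double_inclusion({1, 2, 3}, {3, 2, 1}))  # same elements
print(equal_by_double_inclusion(A, B))                  # B is not a subset of A
```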


Problems 


1. Perhaps the most famous of the logical difficulties referred to in the 
text is Russell’s paradox. To explain what this is, we begin by 
observing that a set can easily have elements which are themselves 
sets, e.g., {1, {2,3}, 4}. This raises the possibility that a set might 
well contain itself as one of its elements. We call such a set an 
abnormal set, and any set which does not contain itself as an element 
we call a normal set. Most sets are normal, and if we suspect that 
abnormal sets are in some way undesirable, we might try to confine 
our attention to the set N of all normal sets. Someone is now sure 
to ask, Is N itself normal or abnormal? It is evidently one or the 
other, and it cannot be both. Show that if N is normal, then it 
must be abnormal. Show also that if N is abnormal, then it must 
be normal. We see in this way that each of our two alternatives is 
self-contradictory, and it seems to be the assumption that N exists 
as a set which has brought us to this impasse. For further discussion 
of these matters, we refer the interested reader to Wilder [42, p. 55] 
or Fraenkel and Bar-Hillel [10, p. 6]. Russell’s own account of the
discovery of his paradox can be found in Russell [36, p. 75]. 

2. The symbol we have used for set inclusion is similar to that used for 
the familiar order relation on the real line: if $x$ and $y$ are real numbers,
$x \le y$ means that $y - x$ is non-negative. The order relation on
the real line has all the properties mentioned in the text:

(1’) $x \le x$ for every $x$;

(2’) $x \le y$ and $y \le x \Rightarrow x = y$;

(3’) $x \le y$ and $y \le z \Rightarrow x \le z$.

It also has an important additional property:

(4’) for any $x$ and $y$, either $x \le y$ or $y \le x$.

Property (4’) says that any two real numbers are comparable with 
respect to the relation in question, and it leads us to call the order 
relation on the real line a total (or linear) order relation. Show by an 
example that this property is not possessed by set inclusion. It is 
for this reason that set inclusion is called a partial order relation. 


3. (a) Let $U$ be the single-element set $\{1\}$. There are two subsets,
the empty set $\emptyset$ and $\{1\}$ itself. If $A$ and $B$ are arbitrary subsets
of $U$, there are four possible relations of the form $A \subseteq B$.
Count the number of true relations among these.

(b) Let $U$ be the set $\{1, 2\}$. There are four subsets. List them.
If $A$ and $B$ are arbitrary subsets of $U$, there are 16 possible
relations of the form $A \subseteq B$. Count the number of true ones.

(c) Let $U$ be the set $\{1, 2, 3\}$. There are 8 subsets. What are
they? There are 64 possible relations of the form $A \subseteq B$.
Count the number of true ones.

(d) Let $U$ be the set $\{1, 2, \ldots, n\}$ for an arbitrary positive
integer $n$. How many subsets are there? How many possible
relations of the form $A \subseteq B$ are there? Can you make an
informed guess as to how many of these are true?


2. THE ALGEBRA OF SETS 


In this section we consider several useful ways in which sets can be 
combined with one another, and we develop the chief properties of these 
operations of combination. 

As we emphasized above, all the sets we mention in this section are 
assumed to be subsets of our universal set U. U is the frame of reference, 
or the universe, for our present discussions. In our later work the frame
of reference in a particular context will naturally depend on what ideas 
we happen to be considering. If we find ourselves studying sets of real 
numbers, then $U$ is the set $R$ of all real numbers. If we wish to study
sets of complex numbers, then we take U to be the set C of all complex 
numbers. We sometimes want to narrow the frame of reference and to 
consider (for instance) only subsets of the closed unit interval [0,1], or 
of the closed unit disc $\{z : |z| \le 1\}$, and in these cases we choose $U$
accordingly. Generally speaking, the universal set U is at our disposal, 
and we are free to select it to fit the needs of the moment. For the 
present, however, U is to be regarded as a fixed but arbitrary set. This 
generality allows us to apply the ideas we develop below to any situation 
which arises in our later work. 


Fig. 1. Set inclusion. Fig. 2. The union of A and B. 


It is extremely helpful to the imagination to have a geometric 
picture available in terms of which we can visualize sets and operations 
on sets. A convenient way to accomplish this is to represent U by a 
rectangular area in a plane, and the elements which make up U by the 
points of this area. Sets can then be pictured by areas within this 
rectangle, and diagrams can be drawn which illustrate operations on sets 
and relations between them. For instance, if A and B are sets, then Fig. 
1 represents the circumstance that A is a subset of B (we think of each 
set as consisting of all points within the corresponding closed curve). 
Diagrammatic thought of this kind is admittedly loose and imprecise; 
nevertheless, the reader will find it invaluable. No mathematics, however
abstract it may appear, is ever carried on without the help of mental 
images of some kind, and these are often nebulous, personal, and diffi- 
cult to describe. 

The first operation we discuss in the algebra of sets is that of forming
unions. The union of two sets $A$ and $B$, written $A \cup B$, is defined
to be the set of all elements which are in either $A$ or $B$ (including
those which are in both). $A \cup B$ is formed by lumping together the
elements of $A$ and those of $B$ and regarding them as constituting a single
set. In Fig. 2, $A \cup B$ is indicated by the shaded area. The above
definition can also be expressed symbolically:
$$A \cup B = \{x : x \in A \text{ or } x \in B\}.$$
The operation of forming unions is commutative and associative:
$$A \cup B = B \cup A \quad\text{and}\quad A \cup (B \cup C) = (A \cup B) \cup C.$$
It has the following additional properties:
$$A \cup A = A, \quad A \cup \emptyset = A, \quad\text{and}\quad A \cup U = U.$$
We also note that
$$A \subseteq B \Leftrightarrow A \cup B = B,$$
so set inclusion can be expressed in terms of this operation.

Our next operation is that of forming intersections. The
intersection of two sets $A$ and $B$, written $A \cap B$, is the set of all elements
which are in both $A$ and $B$. In symbols,
$$A \cap B = \{x : x \in A \text{ and } x \in B\}.$$
$A \cap B$ is the common part of the sets $A$ and $B$. In Fig. 3, $A \cap B$ is
represented by the shaded area. If $A \cap B$ is non-empty, we express this
by saying that $A$ intersects $B$. If, on the other hand, it happens that $A$
and $B$ have no common part, or equivalently that $A \cap B = \emptyset$, then
we say that $A$ does not intersect $B$, or that $A$ and $B$ are disjoint; and a
class of sets in which all pairs of distinct sets are disjoint is called a
disjoint class of sets. The operation of forming intersections is also
commutative and associative:
$$A \cap B = B \cap A \quad\text{and}\quad A \cap (B \cap C) = (A \cap B) \cap C.$$
It has the further properties that
$$A \cap A = A, \quad A \cap \emptyset = \emptyset, \quad\text{and}\quad A \cap U = A;$$
and since
$$A \subseteq B \Leftrightarrow A \cap B = A,$$
we see that set inclusion can also be expressed in terms of forming
intersections.

Fig. 3. The intersection of A and B.
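The properties of unions and intersections listed so far can be checked mechanically on a small example. The following Python sketch is meant only as an illustration; the particular sets U, A, B, C and all variable names are assumptions made for the example, and a check on one example is of course not a proof.

```python
# A small universal set and some of its subsets (all values are illustrative).
U = frozenset(range(1, 7))
A = frozenset({1, 2})
B = frozenset({1, 2, 3, 4})
C = frozenset({5, 6})
EMPTY = frozenset()

# Union is written | and intersection is written & for Python sets.
commutative = ((A | B) == (B | A)) and ((A & B) == (B & A))
associative = ((A | (B | C)) == ((A | B) | C)) and ((A & (B & C)) == ((A & B) & C))

# The special roles of the empty set and the universal set.
identities = ((A | EMPTY) == A and (A | U) == U and
              (A & EMPTY) == EMPTY and (A & U) == A)

# Set inclusion expressed through either operation.
inclusion_via_union = (A <= B) == ((A | B) == B)
inclusion_via_intersection = (A <= B) == ((A & B) == A)

# A and C have no common part, so they are disjoint.
disjoint = (A & C) == EMPTY

print(commutative, associative, identities)
print(inclusion_via_union, inclusion_via_intersection, disjoint)
```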

We have now defined two of the fundamental operations on sets, and 
we have seen how each is related to set inclusion. The next obvious step 
is to see how they are related to one another. The facts here are given by 
the distributive laws:
$$A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$$
and
$$A \cup (B \cap C) = (A \cup B) \cap (A \cup C).$$
These properties depend only on simple logic applied to the meanings of
the symbols involved. For instance, the first of the two distributive
laws says that an element is in $A$ and is in $B$ or $C$ precisely when it is in
$A$ and $B$ or is in $A$ and $C$. We can convince ourselves intuitively of the
validity of these laws by drawing pictures. The second distributive
law is illustrated in Fig. 4, where $A \cup (B \cap C)$ is formed on the left by
shading and $(A \cup B) \cap (A \cup C)$ on the right by cross-shading. A
moment’s consideration of these diagrams ought to convince the reader
that one obtains the same set in each case.

Fig. 4. $A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$.
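Besides drawing pictures, one can test the distributive laws exhaustively when the universal set is small. The Python sketch below does this for a three-element universal set; the helper name subsets and the choice of U are assumptions made for illustration, and the check covers only this one universal set.

```python
from itertools import combinations

def subsets(universe):
    """Return every subset of `universe` as a frozenset."""
    items = list(universe)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

U = {1, 2, 3}
all_subsets = subsets(U)

# Try both laws on every triple (A, B, C) of subsets of U.
first_law = all((A & (B | C)) == ((A & B) | (A & C))
                for A in all_subsets for B in all_subsets for C in all_subsets)
second_law = all((A | (B & C)) == ((A | B) & (A | C))
                 for A in all_subsets for B in all_subsets for C in all_subsets)

print(len(all_subsets), first_law, second_law)
```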

The last of our major operations on sets is the formation of
complements. The complement of a set $A$, denoted by $A'$, is the set of all
elements which are not in $A$. Since the only elements we consider are
those which make up $U$, it goes without saying (but it ought to be said)
that $A'$ consists of all those elements in $U$ which are not in $A$.
Symbolically,
$$A' = \{x : x \notin A\}.$$
Figure 5 (in which $A'$ is shaded) illustrates this operation. The operation
of forming complements has the following obvious properties:
$$(A')' = A, \quad \emptyset' = U, \quad U' = \emptyset,$$
$$A \cup A' = U, \quad\text{and}\quad A \cap A' = \emptyset.$$

Fig. 5. The complement of A.
Further, it is related to set inclusion by
$$A \subseteq B \Leftrightarrow B' \subseteq A'$$
and to the formation of unions and intersections by
$$(A \cup B)' = A' \cap B' \quad\text{and}\quad (A \cap B)' = A' \cup B'. \tag{1}$$

The first equation of (1) says that an element is not in either of two sets
precisely when it is outside of both, and the second says that it is not in 
both precisely when it is outside of one or the other. 
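Since a complement is always taken relative to the universal set, a program must carry $U$ along explicitly. The following Python sketch illustrates the properties of complements, including equations (1), on one example; the function name complement and the sets chosen are assumptions made only for this illustration.

```python
U = frozenset(range(10))          # universal set (illustrative)
A = frozenset({0, 1, 2})
B = frozenset({0, 1, 2, 3, 4})
EMPTY = frozenset()

def complement(S):
    """Return S', the set of elements of U which are not in S."""
    return U - S

basic = (complement(complement(A)) == A and
         complement(EMPTY) == U and complement(U) == EMPTY and
         (A | complement(A)) == U and (A & complement(A)) == EMPTY)

# Complementation reverses inclusion.
reverses_inclusion = (A <= B) == (complement(B) <= complement(A))

# Equations (1): complements turn unions into intersections and back.
eq1_union = complement(A | B) == (complement(A) & complement(B))
eq1_intersection = complement(A & B) == (complement(A) | complement(B))

print(basic, reverses_inclusion, eq1_union, eq1_intersection)
```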

The operations of forming unions and intersections are primarily 
binary operations; that is, each is a process which applies to a pair of sets 
and yields a third. We have emphasized this by our use of parentheses 
to indicate the order in which the operations are to be performed, as in 
$(A_1 \cup A_2) \cup A_3$, where the parentheses direct us first to unite $A_1$ and
$A_2$, and then to unite the result of this with $A_3$. Associativity makes it
possible to dispense with parentheses in an expression like this and to
write $A_1 \cup A_2 \cup A_3$, where we understand that these sets are to be
united in any order and that the order in which the operations are
performed is irrelevant. Similar remarks apply to $A_1 \cap A_2 \cap A_3$.
Furthermore, if $\{A_1, A_2, \ldots, A_n\}$ is any finite class of sets, then we
can form
$$A_1 \cup A_2 \cup \cdots \cup A_n \quad\text{and}\quad A_1 \cap A_2 \cap \cdots \cap A_n$$
in much the same way without any ambiguity of meaning whatever.
In order to shorten the notation, we let $I = \{1, 2, \ldots, n\}$ be the set
of subscripts which index the sets under consideration. $I$ is called the
index set. We then compress the symbols for the union and intersection
just mentioned to $\bigcup_{i \in I} A_i$ and $\bigcap_{i \in I} A_i$. As long as it is quite clear what
the index set is, we can write this union and intersection even more
briefly, in the form $\bigcup_i A_i$ and $\bigcap_i A_i$. For the sake of both brevity and
clarity, these sets are often written $\bigcup_{i=1}^{n} A_i$ and $\bigcap_{i=1}^{n} A_i$.

These extensions of our ideas and notations don’t reach nearly far 
enough. It is often necessary to form unions and intersections of large 
(really large!) classes of sets. Let $\{A_i\}$ be an entirely arbitrary class
of sets indexed by a set $I$ of subscripts. Then
$$\bigcup_{i \in I} A_i = \{x : x \in A_i \text{ for at least one } i \in I\}$$
and
$$\bigcap_{i \in I} A_i = \{x : x \in A_i \text{ for every } i \in I\}$$
define their union and intersection. As above, we usually abbreviate these
notations to $\bigcup_i A_i$ and $\bigcap_i A_i$; and if the class $\{A_i\}$ consists of a sequence
of sets, that is, if $\{A_i\} = \{A_1, A_2, A_3, \ldots\}$, then their union and
intersection are often written in the form $\bigcup_{i=1}^{\infty} A_i$ and $\bigcap_{i=1}^{\infty} A_i$. Observe
that we did not require the class $\{A_i\}$ to be non-empty. If it does
happen that this class is empty, then the above definitions give
(remembering that all sets are subsets of $U$) $\bigcup_i A_i = \emptyset$ and $\bigcap_i A_i = U$. The
second of these facts amounts to the following statement: if we require 
of an element that it belong to each set in a given class, and if there are no 
sets present in the class, then every element satisfies this requirement. 
If we had not made the agreement that the only elements under consider- 
ation are those in U, we would not have been able to assign a meaning to 
the intersection of an empty class of sets. A moment’s consideration 
makes it clear that Eqs. (1) are valid for arbitrary unions and intersections: 


$$\left(\bigcup_i A_i\right)' = \bigcap_i A_i' \quad\text{and}\quad \left(\bigcap_i A_i\right)' = \bigcup_i A_i'. \tag{2}$$

It is instructive to verify these equations for the case in which the class
$\{A_i\}$ is empty.
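The behavior of an empty class of sets is easy to get wrong, and a short experiment may help. The Python sketch below is an illustration under stated assumptions: the universal set U is a small finite set chosen for the example, and the function names union_of, intersection_of, and complement are hypothetical. The intersection is computed starting from U, which is exactly the agreement that makes the empty intersection meaningful.

```python
U = frozenset(range(1, 9))        # universal set (illustrative)

def union_of(family):
    """Union of a class of subsets of U; the empty class yields the empty set."""
    return frozenset().union(*family)

def intersection_of(family):
    """Intersection of a class of subsets of U; the empty class yields U."""
    return U.intersection(*family)

def complement(S):
    """Return S', the set of elements of U which are not in S."""
    return U - S

family = [frozenset({1, 2, 3}), frozenset({2, 3, 4}), frozenset({3, 5})]
empty_family = []

print(union_of(empty_family) == frozenset())   # the empty union is the empty set
print(intersection_of(empty_family) == U)      # the empty intersection is U

# Equations (2), checked for a non-empty class and for the empty class.
for fam in (family, empty_family):
    assert complement(union_of(fam)) == intersection_of(complement(A) for A in fam)
    assert complement(intersection_of(fam)) == union_of(complement(A) for A in fam)
```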

We conclude our treatment of the general theory of sets with a 
brief discussion of certain special classes of sets which are of consider- 
able importance in topology, logic, and measure theory. We usually 
denote classes of sets by capital letters in boldface. 

First, some general remarks which will be useful both now and 
later, especially in connection with topological spaces. We shall often 
have occasion to speak of finite unions and finite intersections, by which 
we mean unions and intersections of finite classes of sets, and by a 
finite class of sets we always mean one which is empty or consists of n 
sets for some positive integer $n$. If we say that a class $\mathbf{A}$ of sets is closed
under the formation of finite unions, we mean that $\mathbf{A}$ contains the
union of each of its finite subclasses; and since the empty subclass
qualifies as a finite subclass of $\mathbf{A}$, we see that its union, the empty set,
is necessarily an element of $\mathbf{A}$. In the same way, a class of sets which is
closed under the formation of finite intersections necessarily contains 
the universal set. 

Now for the special classes of sets mentioned above. For the 
remainder of this section we specifically assume that the universal set 
$U$ is non-empty. A Boolean algebra of sets is a non-empty class $\mathbf{A}$ of
subsets of $U$ which has the following properties:

(1) $A$ and $B \in \mathbf{A} \Rightarrow A \cup B \in \mathbf{A}$;

(2) $A$ and $B \in \mathbf{A} \Rightarrow A \cap B \in \mathbf{A}$;

(3) $A \in \mathbf{A} \Rightarrow A' \in \mathbf{A}$.

Since $\mathbf{A}$ is assumed to be non-empty, it must contain at least one set $A$.
Property (3) shows that $A'$ is in $\mathbf{A}$ along with $A$, and since $A \cap A' = \emptyset$
and $A \cup A' = U$, (1) and (2) guarantee that $\mathbf{A}$ contains the empty set
and the universal set. Since the class consisting only of the empty set
and the universal set is clearly a Boolean algebra of sets, these two
distinct sets are the only ones which every Boolean algebra of sets must
contain. It is equally clear that the class of all subsets of $U$ is also a
Boolean algebra of sets. There are many other less trivial kinds, and 
their applications are manifold in fields of study as diverse as statistics 
and electronics. 
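For a finite universal set the three defining properties can be tested directly. The sketch below is only an illustration under that finiteness assumption; the function name `is_boolean_algebra` is ours, and sets are converted to `frozenset` so that they can be members of a Python set.

```python
def is_boolean_algebra(cls, universe):
    """Return True if `cls` is a Boolean algebra of subsets of `universe`.

    Checks the three defining properties: closure under pairwise unions,
    pairwise intersections, and complements, for a non-empty class of
    subsets of a non-empty universal set.
    """
    universe = frozenset(universe)
    members = {frozenset(s) for s in cls}
    if not universe or not members:
        return False
    for a in members:
        if not a <= universe:            # every member must be a subset of U
            return False
        if universe - a not in members:  # property (3)
            return False
        for b in members:
            if a | b not in members:     # property (1)
                return False
            if a & b not in members:     # property (2)
                return False
    return True


U = {1, 2, 3, 4}
print(is_boolean_algebra([set(), U], U))                  # the smallest example
print(is_boolean_algebra([set(), {1, 2}, {3, 4}, U], U))  # a less trivial one
print(is_boolean_algebra([set(), {1}, U], U))             # fails: {2, 3, 4} is missing
```

The first two classes pass and the third fails, since it contains {1} but not its complement.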

Let $\mathbf{A}$ be a Boolean algebra of sets. It is obvious that if
$\{A_1, A_2, \ldots, A_n\}$ is a non-empty finite subclass of $\mathbf{A}$, then


$$A_1 \cup A_2 \cup \cdots \cup A_n \quad\text{and}\quad A_1 \cap A_2 \cap \cdots \cap A_n$$


are both sets in $\mathbf{A}$; and since $\mathbf{A}$ contains the empty set and the universal
set, it is easy to see that $\mathbf{A}$ is a class of sets which is closed under the
formation of finite unions, finite intersections, and complements. We
now go in the other direction, and let $\mathbf{A}$ be a class of sets which is closed
under the formation of finite unions, finite intersections, and complements.
By these assumptions, $\mathbf{A}$ automatically contains the empty set
and the universal set, so it is non-empty and is easily seen to be a Boolean
algebra of sets. We conclude from these remarks that Boolean algebras
of sets can be described alternatively as classes of sets which are closed
under the formation of finite unions, finite intersections, and complements.
It should be emphasized once again that when discussing
Boolean algebras of sets we always assume that the universal set is
non-empty.

One final comment. We speak of Boolean algebras of sets because 
there are other kinds of Boolean algebras than those which consist of 
sets, and we wish to preserve the distinction. We explore this topic 
further in our Appendix on Boolean algebras. 


Problems 


1. If $\{A_i\}$ and $\{B_j\}$ are two classes of sets such that $\{A_i\} \subseteq \{B_j\}$,
show that $\bigcup_i A_i \subseteq \bigcup_j B_j$ and $\bigcap_j B_j \subseteq \bigcap_i A_i$.

2. The difference between two sets A and B, denoted by A - B, is the set
of all elements in A and not in B; thus $A - B = A \cap B'$. Show
the following:


$$A - B = A - (A \cap B) = (A \cup B) - B;$$
$$(A - B) - C = A - (B \cup C);$$
$$A - (B - C) = (A - B) \cup (A \cap C);$$
$$(A \cup B) - C = (A - C) \cup (B - C);$$
$$A - (B \cup C) = (A - B) \cap (A - C).$$


3. The symmetric difference of two sets A and B, denoted by $A \,\Delta\, B$,
is defined by $A \,\Delta\, B = (A - B) \cup (B - A)$; it is thus the union of
their differences in opposite orders. Show the following:


$$A \,\Delta\, (B \,\Delta\, C) = (A \,\Delta\, B) \,\Delta\, C;$$
$$A \,\Delta\, \emptyset = A; \qquad A \,\Delta\, A = \emptyset;$$
$$A \,\Delta\, B = B \,\Delta\, A;$$
$$A \cap (B \,\Delta\, C) = (A \cap B) \,\Delta\, (A \cap C).$$


4. A ring of sets is a non-empty class $\mathbf{A}$ of sets such that if A and B are
in $\mathbf{A}$, then $A \,\Delta\, B$ and $A \cap B$ are also in $\mathbf{A}$. Show that $\mathbf{A}$ must also
contain the empty set, $A \cup B$, and $A - B$. Show that if a non-empty
class of sets contains the union and difference of any pair of its
sets, then it is a ring of sets. Show that a Boolean algebra of sets is
a ring of sets.

5. Show that the class of all finite subsets (including the empty set) of 
an infinite set is a ring of sets but is not a Boolean algebra of sets. 

6. Show that the class of all finite unions of closed-open intervals on the 
real line is a ring of sets but is not a Boolean algebra of sets. 

7. Assuming that the universal set U is non-empty, show that Boolean 
algebras of sets can be described as rings of sets which contain U. 


3. FUNCTIONS 


Many kinds of functions occur in topology, in a great variety of 
situations. In our work we shall need the full power of the general con- 
cept of a function, and since its modern meaning is much broader and 
deeper than its elementary meaning, we discuss this concept in con- 
siderable detail and develop its main abstract properties. 

Let us begin with a brief inspection of some simple examples. Con- 
sider the elementary function 


$$y = x^2$$


of the real variable x. What do we have in mind when we call this a
function and say that y is a function of x? In a nutshell, we are drawing
attention to the fact that each real number x has linked to it a specific
real number y, which can be calculated according to the rule (or law of
correspondence) given by the formula. We have here a process which,
applied to any real number x, does something to it (squares it) to produce
another real number y (the square of x). Similarly,


$$y = x^2 - 3x \quad\text{and}\quad y = (x^2 + 1)^{-1}$$


are two other simple functions of the real variable x, and each is given
by a rule in the form of an algebraic expression which specifies the exact
manner in which the value of y depends on the value of x.

The rules for the functions we have just mentioned are expressed by
formulas. In general, this is possible only for functions of a very simple
kind or for those which are sufficiently important to deserve special
symbols of their own. Consider, for instance, the function of the real
variable x defined as follows: for each real number x, write x as an infinite
decimal (using the scheme of decimal expansion in which infinite chains
of 9's are avoided, so that, for example, 1/4 is represented by .25000 . . .
rather than by .24999 . . .); then let y be the fifty-ninth digit after
the decimal point. There is of course no standard formula for this,
but nevertheless it is a perfectly respectable function whose rule is given
by a verbal description. On the other hand, the function y = sin x of
the real variable x is so important that its rule, though fully as complicated
as the one just defined, is assigned the special symbol sin. When
discussing functions in general, we want to allow for all sorts of rules and
to talk about them all at once, so we usually employ noncommittal
notations like y = f(x), y = g(x), and so on.
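A rule given by a verbal description is still a rule that can be carried out. As a hedged illustration, the sketch below computes the digit function for rational inputs only, using exact arithmetic; the name `decimal_digit` and the choice to read the digits of the absolute value for negative inputs are our assumptions, not part of the text. Exact rationals produce the terminating expansion automatically, so chains of 9's never arise.

```python
import math
from fractions import Fraction


def decimal_digit(x: Fraction, n: int) -> int:
    """Return the n-th digit after the decimal point of the rational number x.

    Assumption: for negative x the digits of |x| are used.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    # Shifting the decimal point n places right and discarding the fractional
    # part leaves the wanted digit in the units position.
    return math.floor(abs(x) * 10**n) % 10


quarter = Fraction(1, 4)
print(decimal_digit(quarter, 1), decimal_digit(quarter, 2))  # the digits of .25
print(decimal_digit(quarter, 59))                            # fifty-ninth digit of .25000...
print(decimal_digit(Fraction(1, 3), 59))                     # fifty-ninth digit of .33333...
```

For an arbitrary real number no such finite computation is available in general, which is one more reason the concept of a function cannot be tied to formulas.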

Each of the functions mentioned above is defined for all real numbers 
x. The example y = 1/x shows that this restriction is much too severe,
for this function is defined only for non-zero values of x. Similarly,
y = log x is defined only for positive values of x, and $y = \sin^{-1} x$ only for
values of x which lie in the interval [-1,1]. Whatever our conception
of a function may be, it should certainly be broad enough to include 
examples like these, which are defined only for some values of the real 
variable x. 

In real analysis the notion of function is introduced in the following 
way. Let X be any non-empty set of real numbers. We say that a 
function y = f(x) is defined on X if the rule f associates a definite real 
number y with each real number x in X. The specific nature of the 
rule f is totally irrelevant to the concept of a function. The set X is 
called the domain of the given function, and the set Y of all the values it 
assumes is called its range. If we speak of complex numbers here 
instead of real numbers, we have the notion of function as it is used in 
complex analysis. 

This point of view toward functions is actually a bit more general 
than is needed for the aims of analysis, but it isn’t nearly general 
enough for our purposes. The sets X and Y above were taken to be 
sets of numbers. If we now remove even this restriction and allow X 
and Y to be completely arbitrary non-empty sets, then we arrive at the 
most inclusive concept of a function. By way of illustration, suppose 
that X is the set of all squares in a plane and that Y is the set of all 
circles in the same plane. We can define a function y = f(x) by requiring
that the rule f associate with each square x that circle y which is inscribed
in it. In general, there is no need at all for either X or Y to be a set of
numbers. All that is really necessary for a function is two non-empty
sets X and Y and a rule f which is meaningful and unambiguous in 
assigning to each element x in X a specific element y in Y. 

With these preliminary descriptive remarks, we now turn to the 
rather abstract but very precise ideas they are intended to motivate. 

A function consists of three objects: two non-empty sets X and Y
(which may be equal, but need not be) and a rule f which assigns to each
element x in X a single fully determined element y in Y. The y which
corresponds in this way to a given x is usually written f(x), and is called
the image of x under the rule f, or the value of f at the element x. This
notation is supposed to be suggestive of the idea that the rule f takes the
element x and does something to it to produce the element y = f(x).
The rule f is often called a mapping, or transformation, or operator, to
amplify this concept of it. We then think of f as mapping x's to y's, or
transforming x's into y's, or operating on x's to produce y's. The set X
is called the domain of the function, and the set of all f(x)'s for all x's
in X is called its range. A function whose range consists of just one
element is called a constant function.

Fig. 6. A way of visualizing mappings.

We often denote by $f: X \to Y$ the function with rule f, domain X,
and range contained in Y. This notation is useful because the essential
parts of the function are displayed in a manner which emphasizes that it
is a composite object, the central thing being the rule or mapping f.
Figure 6 gives a convenient way of picturing this function. On the
left, X and Y are different sets, and on the right, they are equal, in which
case we usually refer to f as a mapping of X into itself. If it is clear
from the context what the sets X and Y are, or if there is no real need to
specify them explicitly, it is common practice to identify the function
$f: X \to Y$ with the rule f, and to speak of f alone as if it were the function
under consideration (without mentioning the sets X and Y).
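The view of a function as a composite object (two sets and a rule) can be mirrored in code. The following is a minimal sketch for finite sets, assuming the rule is stored as a dictionary; the class name `Mapping` and its method names are hypothetical and chosen only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Mapping:
    """A function f: X -> Y given by two non-empty finite sets and a rule."""
    domain: frozenset    # the set X
    codomain: frozenset  # the set Y, which contains the range
    rule: dict           # assigns to each x in X exactly one y in Y

    def __post_init__(self):
        if not self.domain or not self.codomain:
            raise ValueError("X and Y must be non-empty")
        if set(self.rule) != set(self.domain):
            raise ValueError("the rule must be defined on every element of X and nowhere else")
        if not set(self.rule.values()) <= set(self.codomain):
            raise ValueError("every value f(x) must lie in Y")

    def __call__(self, x):
        """The image of x under the rule."""
        return self.rule[x]

    def value_range(self):
        """The set of all f(x) for x in X, which may be smaller than Y."""
        return frozenset(self.rule.values())

    def is_constant(self):
        """A constant function is one whose range has just one element."""
        return len(self.value_range()) == 1


X = frozenset({-2, -1, 0, 1, 2})
Y = frozenset(range(5))
square = Mapping(X, Y, {x: x * x for x in X})
print(square(-2))            # the value of the rule at -2
print(square.value_range())  # the range, a proper subset of Y here
print(square.is_constant())
```

Keeping the codomain separate from the range reflects the notation above, in which the range is only required to be contained in Y.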

It sometimes happens that two perfectly definite sets X and Y are
under discussion and that a mapping of X into Y arises which has no
natural symbol attached to it. If there is no necessity to invent a
symbol for this mapping, and if it is quite clear what the mapping is, it is
often convenient to designate it by $x \to y$. Accordingly, the function
$y = x^2$ mentioned at the beginning of this section can be written as
$x \to x^2$ or $x \to y$ (where y is understood to be the square of x).

A function f is called an extension of a function g (and g is called a 
restriction of f) if the domain of f contains the domain of g and f(x) = g(x)
for each x in the domain of g. 

Most of mathematical analysis, both classical and modern, deals
with functions whose values are real numbers or complex numbers.
This is also true of those parts of topology which are concerned with
the foundations of analysis. If the range of a function consists of real
numbers, we call it a real function; similarly, a complex function is one
whose range consists of complex numbers. Obviously, every real function
is also complex. We lay very heavy emphasis on real and complex functions
throughout our work.

Fig. 7. The inverse of a mapping.

As a matter of usage, we generally prefer to reserve the term function 
for real or complex functions and to speak of mappings when dealing 
with functions whose values are not necessarily numbers. 

Consider a mapping $f: X \to Y$. When we call f a mapping of X
into Y, we mean to suggest by this that the elements f(x), as x varies
over all the elements of X, need not fill up Y; but if it definitely does
happen that the range of f equals Y, or if we specifically want to assume
this, then we call f a mapping of X onto Y. If two different elements
in X always have different images under f, then we call f a one-to-one
mapping of X into Y. If $f: X \to Y$ is both onto and one-to-one, then
we can define its inverse mapping $f^{-1}: Y \to X$ as follows: for each y in Y,
we find that unique element x in X such that f(x) = y (x exists and is
unique since f is onto and one-to-one); we then define x to be $f^{-1}(y)$.
The equation $x = f^{-1}(y)$ is the result of solving y = f(x) for x in just the
same way as x = log y is the result of solving $y = e^x$ for x. Figure 7
illustrates the concept of the inverse of a mapping.
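For mappings between finite sets these notions are easy to test. The sketch below assumes a mapping is represented by a dictionary whose keys form the domain X; the function names are illustrative only.

```python
def is_onto(rule, codomain):
    """True if the range of the mapping equals the whole set Y."""
    return set(rule.values()) == set(codomain)


def is_one_to_one(rule):
    """True if different elements of X always have different images."""
    return len(set(rule.values())) == len(rule)


def inverse(rule, codomain):
    """Return the inverse mapping Y -> X, which exists only when the
    mapping is both onto and one-to-one."""
    if not (is_onto(rule, codomain) and is_one_to_one(rule)):
        raise ValueError("an inverse exists only for a one-to-one onto mapping")
    # Solving y = f(x) for x amounts to swapping each pair (x, y).
    return {y: x for x, y in rule.items()}


f = {1: "a", 2: "b", 3: "c"}
Y = {"a", "b", "c"}
print(is_onto(f, Y), is_one_to_one(f))
print(inverse(f, Y))
```

If either condition fails, no inverse is returned: without onto, some y would have no x to go back to, and without one-to-one, some y would have more than one.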



If f is a one-to-one mapping of X onto Y, it will sometimes be convenient
to subordinate the conception of f as a mapping sending x's
over to y's and to emphasize its role as a link between x's and y's. Each
x has linked to it (or has corresponding to it) precisely one y = f(x);
and, turning the situation around, each y has linked to it (or has
corresponding to it) exactly one $x = f^{-1}(y)$. When we focus our attention
on this aspect of a mapping which is one-to-one onto, we usually call it
a one-to-one correspondence. Thus f is a one-to-one correspondence
between X and Y, and $f^{-1}$ is a one-to-one correspondence between Y
and X.

Now consider an arbitrary mapping $f: X \to Y$. The mapping f,
which sends each element of X over to an element of Y, induces the
following two important set mappings. If A is a subset of X, then its
image f(A) is the subset of Y defined by


$$f(A) = \{f(x) : x \in A\},$$


and our first set mapping is that which sends each A over to its
corresponding f(A). Similarly, if B is a subset of Y, then its inverse image
$f^{-1}(B)$ is the subset of X defined by


$$f^{-1}(B) = \{x : f(x) \in B\},$$


and the second set mapping pulls each B back to its corresponding
$f^{-1}(B)$. It is often essential for us to know how these set mappings
behave with respect to set inclusion and operations on sets. We develop
most of their significant features in the following two paragraphs.

The main properties of the first set mapping are: 


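$$f(\emptyset) = \emptyset; \qquad f(X) \subseteq Y;$$
$$A_1 \subseteq A_2 \Rightarrow f(A_1) \subseteq f(A_2);$$
$$f\Bigl(\bigcup_i A_i\Bigr) = \bigcup_i f(A_i); \tag{1}$$
$$f\Bigl(\bigcap_i A_i\Bigr) \subseteq \bigcap_i f(A_i).$$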


The reader should convince himself of the truth of these statements.
For instance, to prove (1) we would have to prove first that $f(\bigcup_i A_i)$ is a
subset of $\bigcup_i f(A_i)$, and second that $\bigcup_i f(A_i)$ is a subset of $f(\bigcup_i A_i)$.
A proof of the first of these set inclusions might run as follows: an element
in $f(\bigcup_i A_i)$ is the image of some element in $\bigcup_i A_i$, therefore it is the image
of an element in some $A_i$, therefore it is in some $f(A_i)$, and so finally it is in
$\bigcup_i f(A_i)$. The irregularities and gaps which the reader will notice in the
above statements are essential features of this set mapping. For example,
the image of an intersection need not equal the intersection of the
images, because two disjoint sets can easily have images which are not
disjoint. Furthermore, without special assumptions (see Problem 6)
nothing can be said about the relation between f(A)' and f(A').

The second set mapping is much better behaved. Its properties
are satisfyingly complete, and can be stated as follows:


$$f^{-1}(\emptyset) = \emptyset; \qquad f^{-1}(Y) = X;$$
$$B_1 \subseteq B_2 \Rightarrow f^{-1}(B_1) \subseteq f^{-1}(B_2);$$
$$f^{-1}\Bigl(\bigcup_i B_i\Bigr) = \bigcup_i f^{-1}(B_i); \tag{2}$$
$$f^{-1}\Bigl(\bigcap_i B_i\Bigr) = \bigcap_i f^{-1}(B_i); \tag{3}$$
$$f^{-1}(B') = f^{-1}(B)'. \tag{4}$$


Again, the reader should verify each of these statements for himself. 
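A small experiment makes the contrast between the two set mappings concrete. The sketch below is an illustration only: it takes X to be the integers from -3 to 3, Y the integers from 0 to 9, and f the squaring map, all of which are our choices, and the helper names `image` and `inverse_image` are hypothetical.

```python
def image(f, A):
    """The image f(A) = {f(x) : x in A}."""
    return {f(x) for x in A}


def inverse_image(f, X, B):
    """The inverse image of B: all x in X with f(x) in B."""
    return {x for x in X if f(x) in B}


X = set(range(-3, 4))
Y = set(range(10))


def f(x):
    return x * x  # maps X into Y


# First set mapping: unions are preserved, intersections only up to inclusion.
A1, A2 = {-2, -1}, {1, 2}
print(image(f, A1 | A2) == image(f, A1) | image(f, A2))
print(image(f, A1 & A2) == image(f, A1) & image(f, A2))  # False: disjoint sets, equal images

# Second set mapping: unions, intersections, and complements are all preserved.
B1, B2 = {0, 1, 4}, {4, 9}
print(inverse_image(f, X, B1 | B2) == inverse_image(f, X, B1) | inverse_image(f, X, B2))
print(inverse_image(f, X, B1 & B2) == inverse_image(f, X, B1) & inverse_image(f, X, B2))
print(inverse_image(f, X, Y - B1) == X - inverse_image(f, X, B1))
```

Here A1 and A2 are disjoint, so the image of their intersection is empty, while their images coincide. This is exactly the situation described in the discussion of the first set mapping.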


Fig. 8. Multiplication of mappings.


We discuss one more concept in this section, that of the multiplication
(or composition) of mappings. If $y = f(x) = x^2 + 1$ and


$$z = g(y) = \sin y,$$


then these two functions can be put together to form a single function
defined by $z = (gf)(x) = g(f(x)) = g(x^2 + 1) = \sin (x^2 + 1)$. One of
the most important tools of calculus (the chain rule) explains how to
differentiate functions of this kind. This manner of multiplying functions
together is of basic importance for us as well, and we formulate it in
general as follows. Suppose that $f: X \to Y$ and $g: Y \to Z$ are any two
mappings. We define the product of these mappings, denoted by
$gf: X \to Z$, by (gf)(x) = g(f(x)). In words: an element x in X is taken
by f to the element f(x) in Y, and then g maps f(x) to g(f(x)) in Z. Figure
8 is a picture of this process. We observe that the two mappings involved
here are not entirely arbitrary, for the set Y which contains the range of
the first equals the domain of the second. More generally, the product
of two mappings is meaningful whenever the range of the first is
contained in the domain of the second. We have regarded f as the first
mapping and g as the second, and in forming their product gf, their
symbols have gotten turned around. This is a rather unpleasant
phenomenon, for which we blame the occasional perversity of mathematical
symbols. Perhaps it will help the reader to keep this straight
in his mind if he will remember to read the product gf from right to left:
first apply f, then g.
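The right-to-left reading is easy to see in code. The following sketch uses the two functions of the example above; the helper name `compose` is ours.

```python
import math


def compose(g, f):
    """Return the product gf, defined by (gf)(x) = g(f(x)): first apply f, then g."""
    return lambda x: g(f(x))


def f(x):
    return x * x + 1


g = math.sin

gf = compose(g, f)  # x -> sin(x^2 + 1)
fg = compose(f, g)  # x -> (sin x)^2 + 1, a different function

print(gf(2.0) == math.sin(5.0))
print(gf(2.0) == fg(2.0))
```

The second comparison is false, which shows that the order in which the two mappings are written matters.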


Problems 


1. Two mappings $f: X \to Y$ and $g: X \to Y$ are said to be equal (and we
write this f = g) if f(x) = g(x) for every x in X. Let f, g, and h
be any three mappings of a non-empty set X into itself, and show
that multiplication of mappings is associative in the sense that
f(gh) = (fg)h.

2. Let X be a non-empty set. The identity mapping $i_X$ on X is the
mapping of X onto itself defined by $i_X(x) = x$ for every x. Thus $i_X$
sends each element of X to itself; that is, it leaves fixed each element
of X. Show that $f i_X = i_X f = f$ for any mapping f of X into itself.
If f is one-to-one onto, so that its inverse $f^{-1}$ exists, show that
$f f^{-1} = f^{-1} f = i_X$. Show further that $f^{-1}$ is the only mapping of X
into itself which has this property; that is, show that if g is a mapping
of X into itself such that $fg = gf = i_X$, then $g = f^{-1}$ (hint:
$g = g i_X = g(f f^{-1}) = (gf) f^{-1} = i_X f^{-1} = f^{-1}$, or
$g = i_X g = (f^{-1} f) g = f^{-1}(fg) = f^{-1} i_X = f^{-1}$).

3. Let X and Y be non-empty sets and f a mapping of X into Y. Show
the following:
(a) f is one-to-one $\Leftrightarrow$ there exists a mapping g of Y into X such
that $gf = i_X$;
(b) f is onto $\Leftrightarrow$ there exists a mapping h of Y into X such that
$fh = i_Y$.


4. Let X be a non-empty set and f a mapping of X into itself. Show 
that f is one-to-one onto $\Leftrightarrow$ there exists a mapping g of X into itself
such that $fg = gf = i_X$. If there exists a mapping g with this
property, then there is only one such mapping. Why? 

5. Let X be a non-empty set, and let f and g be one-to-one mappings 
of X onto itself. Show that fg is also a one-to-one mapping of X 
onto itself and that $(fg)^{-1} = g^{-1} f^{-1}$.

6. Let X and Y be non-empty sets and f a mapping of X into Y. If
A and B are, respectively, subsets of X and Y, show the following:
(a) $f f^{-1}(B) \subseteq B$, and $f f^{-1}(B) = B$ is true for all B $\Leftrightarrow$ f is onto;

(b) $A \subseteq f^{-1} f(A)$, and $A = f^{-1} f(A)$ is true for all A $\Leftrightarrow$ f is one-to-one;

(c) $f(A_1 \cap A_2) = f(A_1) \cap f(A_2)$ is true for all $A_1$ and $A_2$ $\Leftrightarrow$ f is
one-to-one;

(d) $f(A)' \subseteq f(A')$ is true for all A $\Leftrightarrow$ f is onto;

(e) if f is onto (so that $f(A)' \subseteq f(A')$ is true for all A), then
$f(A)' = f(A')$ is true for all A $\Leftrightarrow$ f is also one-to-one.


4. PRODUCTS OF SETS


We shall often have occasion to weld together the sets of a given 
class into a single new set called their product (or their Cartesian product). 
The ancestor of this concept is the coordinate plane of analytic geometry, 
that is, a plane equipped with the usual rectangular coordinate system. 
We give a brief description of this fundamental idea with a view to 
paving the way for our discussion of products of sets in general. 

First, a few preliminary comments about the real line. We have
already used this term several times without any explanation, and of
course what we mean by it is an ordinary geometric straight line (see
Fig. 9) whose points have been identified with, or coordinatized by, the
set R of all real numbers. We use the letter R to denote the real line
as well as the set of all real numbers, and we often speak of real numbers
as if they were points on the real line, and of points on the real line as if
they were real numbers. Let no one be deceived into thinking that the
real line is a simple thing, for its structure is exceedingly intricate. Our
present view of it, however, is as naive and uncomplicated as the picture
of it given in Fig. 9. Generally speaking, we assume that the reader is
familiar with the simpler properties of the real line: those relating to
inequalities (see Problem 1-2) and the basic algebraic operations of
addition, subtraction, multiplication, and division. One of the most
significant facts about the real number system is perhaps less well
known. This is the so-called least upper bound property, which asserts
that every non-empty set of real numbers which has an upper bound has
a least upper bound. It is an easy consequence of this that every
non-empty set of real numbers which has a lower bound has a greatest lower
bound. All these matters can be developed rigorously on the basis of a
small number of axioms, and detailed treatments can often be found in
books on elementary abstract algebra.

Fig. 9. The real line.

To construct the coordinate plane, we now proceed as follows. We
take two identical replicas of the real line, which we call the x axis
and the y axis, and paste them on a plane at right angles to one another
in such a way that they cross at the zero point on each. The usual
picture is given in Fig. 10. Now let P be a point in the plane. We
project P perpendicularly onto points $P_x$ and $P_y$ on the axes. If x and y
are the coordinates of $P_x$ and $P_y$ on their respective axes, this process
leads us from the point P to the uniquely determined ordered pair (x,y)
of real numbers, where x and y are called the x coordinate and y coordinate
of P. We can reverse the process, and, starting with the ordered
pair of real numbers, we can recapture the point. This is the manner in
which we establish the familiar one-to-one correspondence between
points P in the plane and ordered pairs (x,y) of real numbers. In fact,
we think of a point in the plane (which is a geometric object) and its
corresponding ordered pair of real numbers (which is an algebraic
object) as being, to all intents and purposes, identical with one another.
The essence of analytic geometry lies in the possibility of exploiting this
identification by using algebraic tools in geometric arguments and
giving geometric interpretations to algebraic calculations.

Fig. 10. The coordinate plane.

The conventional attitude toward the coordinate plane in analytic
geometry is that the geometry is the focus of interest and the algebra
of ordered pairs is only a convenient tool. Here we reverse this
point of view. For us, the coordinate plane is defined to be the set of all
ordered pairs (x,y) of real numbers. We can satisfy our desire for visual
images by using Fig. 10 as a picture of this set and by calling such an
ordered pair a point, but this geometric language is more a convenience
than a necessity.

Our notation for the coordinate plane is $R \times R$, or $R^2$. This
symbolism reflects the idea that the coordinate plane is the result of
"multiplying together" two replicas of the real line R.

It is perhaps necessary to comment on one possible source of
misunderstanding. When we speak of $R^2$ as a plane, we do so only to
establish an intuitive bond with the reader's previous experience in
analytic geometry. Our present attitude is that $R^2$ is a pure set and has no
structure whatever, because no structure has yet been assigned to it.
We remarked earlier (with deliberate vagueness) that a space is a set
to which has been added some kind of algebraic or geometric structure.
In Sec. 15 we shall convert the set $R^2$ into the space of analytic geometry
by defining the distance between any two points $(x_1, y_1)$ and $(x_2, y_2)$ to be


$$\sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}.$$


This notion of distance endows the set $R^2$ with a certain "spatial"
character, which we shall recognize by calling the resulting space the Euclidean
plane instead of the coordinate plane.


We assume that the reader is fully acquainted with the way in
which the set C of all complex numbers can be identified (as a set)
with the coordinate plane $R^2$. If z is a complex number, and if z has
the standard form x + iy where x and y are real numbers, then we
identify z with the ordered pair (x,y), and thus with an element of $R^2$.
The complex numbers, however, are much more than merely a set.
They constitute a number system, with operations of addition,
multiplication, conjugation, etc. When the coordinate plane $R^2$ is thought
of as consisting of complex numbers and is enriched by the algebraic
structure it acquires in this way, it is called the complex plane. The
letter C is used to denote either the set of all complex numbers or
the complex plane. We shall make a space out of the complex plane
in Sec. 9.

Suppose now that $X_1$ and $X_2$ are any two non-empty sets. By
analogy with our above discussion, their product $X_1 \times X_2$ is defined to
be the set of all ordered pairs $(x_1, x_2)$, where $x_1$ is in $X_1$ and $x_2$ is in $X_2$.
In spite of the arbitrary nature of $X_1$ and $X_2$, their product can be
represented by a picture (see Fig. 11) which is loosely similar to the usual
picture of the coordinate plane. The term product is applied to this set,
and it is thought of as the result of "multiplying together" $X_1$ and $X_2$, for
the following reason: if $X_1$ and $X_2$ are finite sets with m and n elements,
then (clearly) $X_1 \times X_2$ has mn elements. If $f: X_1 \to X_2$ is a mapping
with domain $X_1$ and range in $X_2$, its graph is that subset of $X_1 \times X_2$
which consists of all ordered pairs of the form $(x_1, f(x_1))$. We observe
that this is an appropriate generalization of the concept of the graph of
a function as it occurs in elementary mathematics.

Fig. 11. A way of visualizing $X_1 \times X_2$.
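For finite sets both the counting remark and the notion of a graph can be observed directly. The sketch below uses `itertools.product` from the Python standard library; the particular sets and the mapping are arbitrary choices made for illustration, and `graph` is a hypothetical helper name.

```python
from itertools import product

X1 = {1, 2, 3}    # m = 3 elements
X2 = {"a", "b"}   # n = 2 elements

# The product is the set of all ordered pairs (x1, x2).
pairs = set(product(X1, X2))
print(len(pairs) == len(X1) * len(X2))  # mn elements


def graph(rule):
    """The graph of a mapping given as a dict: all pairs (x1, f(x1))."""
    return {(x, y) for x, y in rule.items()}


f = {1: "a", 2: "a", 3: "b"}  # a mapping of X1 into X2
print(graph(f) <= pairs)      # the graph is a subset of the product
print(len(graph(f)) == len(X1))
```

The graph contains exactly one pair for each element of the domain, which is why it has m elements rather than mn.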

This definition of the product of two sets extends easily to the case
of n sets for any positive integer n. If $X_1, X_2, \ldots, X_n$ are non-empty
sets, then their product $X_1 \times X_2 \times \cdots \times X_n$ is the set of all ordered
n-tuples $(x_1, x_2, \ldots, x_n)$, where $x_i$ is in $X_i$ for each subscript i. If the
$X_i$'s are all replicas of a single set X, that is, if


$$X_1 = X_2 = \cdots = X_n = X,$$


then their product is usually denoted by the symbol $X^n$.

These ideas specialize directly to yield the important sets $R^n$ and
$C^n$. $R^1$ is just R, the real line, and $R^2$ is the coordinate plane. $R^3$, the
set of all ordered triples of real numbers, is the set which underlies
solid analytic geometry, and we assume that the reader is familiar with
the manner in which this set arises, through the introduction of a
rectangular coordinate system into ordinary three-dimensional space. We
can draw pictures here just as in the case of the coordinate plane, and we
can use geometric language as much as we please, but it must be
understood that the mathematics of this set is the mathematics of ordered
triples of real numbers and that the pictures are merely an aid to the
intuition. Once we fully grasp this point of view, there is no difficulty
whatever in advancing at once to the study of the set $R^n$ of all ordered
n-tuples $(x_1, x_2, \ldots, x_n)$ of real numbers for any positive integer n.
It is quite true that when n is greater than 3 it is no longer possible to
draw the same kinds of intuitively rich pictures, but at worst this is
merely an inconvenience. We can (and do) continue to use suggestive
geometric language, so all is not lost. The set $C^n$ is defined similarly:
it is the set of all ordered n-tuples $(z_1, z_2, \ldots, z_n)$ of complex numbers.
Each of the sets $R^n$ and $C^n$ plays a prominent part in our later work.

We emphasized above that for the present the coordinate plane is to
be considered as merely a set, and not a space. Similar remarks apply
to $R^n$ and $C^n$. In due course (in Sec. 15) we shall impart form and
content to each of these sets by suitable definitions. We shall convert
them into the Euclidean and unitary n-spaces which underlie and motivate
so many developments in modern pure mathematics, and we shall
explore some aspects of their algebraic and topological structure to the
very last pages of this book. But as of now (and this is the point we
insist on) neither one of these sets has any structure at all.

As the reader doubtless suspects, it is not enough that we consider
only products of finite classes of sets. The needs of topology compel 
us to extend these ideas to arbitrary classes of sets. 

We defined the product $X_1 \times X_2 \times \cdots \times X_n$ to be the set of
all ordered n-tuples $(x_1, x_2, \ldots, x_n)$ such that $x_i$ is in $X_i$ for each
subscript i. To see how to extend this definition, we reformulate it as
follows. We have an index set I, consisting of the integers from 1 to n,
and corresponding to each index (or subscript) i we have a non-empty
set $X_i$. The n-tuple $(x_1, x_2, \ldots, x_n)$ is simply a function (call it x)
defined on the index set I, with the restriction that its value $x(i) = x_i$ is
an element of the set $X_i$ for each i in I. Our point of view here is that
the function x is completely determined by, and is essentially equivalent
to, the array $(x_1, x_2, \ldots, x_n)$ of its values.

The way is now open for the definition of products in their full
generality. Let $\{X_i\}$ be a non-empty class of non-empty sets, indexed by
the elements i of an index set I. The sets $X_i$ need not be different
from one another; indeed, it may happen that they are all identical
replicas of a single set, distinguished only by different indices. The
product of the sets $X_i$, written $P_{i \in I} X_i$, is defined to be the set of all
functions x defined on I such that x(i) is an element of the set $X_i$ for
each index i. We call $X_i$ the ith coordinate set. When there can be no
misunderstanding about the index set, the symbol $P_{i \in I} X_i$ is often
abbreviated to $P_i X_i$. The definition we have just given requires that each
coordinate set be non-empty before the product can be formed. It will
be useful if we extend this definition slightly by agreeing that if any of
the $X_i$'s are empty, then $P_i X_i$ is also empty.
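When the index set and the coordinate sets are finite, the definition by means of functions can be carried out literally. In the sketch below, which is an illustration under that finiteness assumption, an indexed class of sets is a dictionary from indices to sets, and each element of the product is itself a dictionary, that is, a function on the index set. The name `product_of_sets` is ours.

```python
from itertools import product


def product_of_sets(family):
    """Return all functions x on the index set with x[i] in family[i].

    `family` is a dict mapping each index i to its coordinate set X_i.
    Each element of the product is returned as a dict {i: x_i}.
    """
    indices = list(family)
    coordinate_sets = [family[i] for i in indices]
    return [dict(zip(indices, values)) for values in product(*coordinate_sets)]


family = {"i": {0, 1}, "j": {"a", "b", "c"}}  # indices need not be integers
elements = product_of_sets(family)
print(len(elements))  # 2 * 3 functions on the index set {"i", "j"}
print(all(x[k] in family[k] for x in elements for k in family))

# If any coordinate set is empty there are no such functions at all,
# in agreement with the convention adopted in the text.
print(product_of_sets({"i": {0, 1}, "j": set()}))
```

The text requires the indexed class itself to be non-empty; the sketch does not enforce this and should only be called with at least one index.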

This approach to the idea of the product of a class of sets, by means of
functions defined on the index set, is useful mainly in giving the definition.
In practice, it is much more convenient to use the subscript notation
$x_i$ instead of the function notation x(i). We then interpret the product
$P_i X_i$ as made up of elements x, each of which is specified by the exhibited
array $\{x_i\}$ of its values in the respective coordinate sets $X_i$. We call
$x_i$ the ith coordinate of the element $x = \{x_i\}$.

The mapping $p_i$ of the product $P_i X_i$ onto its ith coordinate set $X_i$
which is defined by $p_i(x) = x_i$ (that is, the mapping whose value
at an arbitrary element of the product is the ith coordinate of that element)
is called the projection onto the ith coordinate set. The projection
$p_i$ selects the ith coordinate of each element in its domain. There is
clearly one projection for each element of the index set I, and the set of
all projections plays an important role in the general theory of topological
spaces.
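Projections are equally simple to write down once elements of a product are represented as functions on the index set. The following minimal sketch represents each element as a dictionary from indices to coordinates; the name `projection` and the sample data are illustrative assumptions.

```python
def projection(i):
    """Return the projection p_i, which selects the i-th coordinate of an element."""
    return lambda x: x[i]


# Some elements of a product indexed by {"i", "j"}, with coordinate sets
# X_i = {0, 1} and X_j = {"a", "b"}.
elements = [
    {"i": 0, "j": "a"},
    {"i": 0, "j": "b"},
    {"i": 1, "j": "a"},
    {"i": 1, "j": "b"},
]

p_i = projection("i")
p_j = projection("j")

print(p_i({"i": 1, "j": "a"}))        # the i-th coordinate of this element
print({p_i(x) for x in elements})     # the range of p_i is all of X_i
print({p_j(x) for x in elements})     # the range of p_j is all of X_j
```

Since the coordinate sets here are non-empty, each projection takes every value in its coordinate set, which is why the text speaks of a mapping onto that set.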


Problems 


1. The graph of a mapping $f: X \to Y$ is a subset of the product $X \times Y$.
What properties characterize the graphs of mappings among all
subsets of $X \times Y$?

2. Let X and Y be non-empty sets. If $A_1$ and $A_2$ are subsets of X, and
$B_1$ and $B_2$ subsets of Y, show the following:


$$(A_1 \times B_1) \cap (A_2 \times B_2) = (A_1 \cap A_2) \times (B_1 \cap B_2);$$

$$(A_1 \times B_1) - (A_2 \times B_2) = [(A_1 - A_2) \times (B_1 - B_2)]
\cup [(A_1 \cap A_2) \times (B_1 - B_2)]
\cup [(A_1 - A_2) \times (B_1 \cap B_2)].$$


3. Let X and Y be non-empty sets, and let $\mathbf{A}$ and $\mathbf{B}$ be rings of subsets
of X and Y, respectively. Show that the class of all finite unions of
sets of the form $A \times B$ with $A \in \mathbf{A}$ and $B \in \mathbf{B}$ is a ring of subsets of
$X \times Y$.


5. PARTITIONS AND EQUIVALENCE RELATIONS 


In the first part of this section we consider a non-empty set X, and
we study decompositions of X into non-empty subsets which fill it out
and have no elements in common with one another. We give special
attention to the tools (equivalence relations) which are normally used to
generate such decompositions.

A partition of X is a disjoint class $\{X_i\}$ of non-empty subsets of X
whose union is the full set X itself. The $X_i$'s are called the partition sets.
Expressed somewhat differently, a partition of X is the result of splitting 
it, or subdividing it, into non-empty subsets in such a way that each 
element of X belongs to one and only one of the given subsets. 

If X is the set {1, 2, 3, 4, 5}, then {1, 3, 5}, {2, 4} and {1, 2, 3}, {4, 5}
are two different partitions of X. If X is the set R of all real numbers,
then we can partition X into the set of all rationals and the set of all
irrationals, or into the infinitely many closed-open intervals of the form
[n, n + 1) where n is an integer. If X is the set of all points in the
coordinate plane, then we can partition X in such a way that each
partition set consists of all points with the same x coordinate (vertical
lines), or so that each partition set consists of all points with the same 
y coordinate (horizontal lines). 
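
For a finite set, the three requirements in the definition of a partition (non-empty sets, no elements in common, union equal to X) can be checked mechanically. The following Python sketch is only an illustration; the name is_partition is our own choice and does not come from the text.

```python
from itertools import combinations

def is_partition(blocks, X):
    """Return True if `blocks` is a partition of the finite set X."""
    blocks = [set(b) for b in blocks]
    # Every partition set must be non-empty.
    if any(len(b) == 0 for b in blocks):
        return False
    # Distinct partition sets must have no elements in common.
    if any(a & b for a, b in combinations(blocks, 2)):
        return False
    # The union of the partition sets must be the full set X.
    return set().union(*blocks) == set(X)

X = {1, 2, 3, 4, 5}
print(is_partition([{1, 3, 5}, {2, 4}], X))     # True
print(is_partition([{1, 2, 3}, {4, 5}], X))     # True
print(is_partition([{1, 2}, {2, 3, 4, 5}], X))  # False: 2 lies in two sets
```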

Other partitions of each of these sets will readily occur to the reader. 
In general, there are many different ways in which any given set can 
be partitioned. These manufactured examples are admittedly rather 
uninspiring and serve only to make our ideas more concrete. Later in 
this section we consider some others which are more germane to our 
present purposes. 

A binary relation in the set X is a mathematical symbol or verbal 
phrase, which we denote by R in this paragraph, such that for each 
ordered pair (x,y) of elements of X the statement x R y is meaningful,
in the sense that it can be classified definitely as true or false. For such
a binary relation, x R y symbolizes the assertion that x is related by R to
y, and x R̸ y the negation of this, namely, the assertion that x is not
related by R to y. Many examples of binary relations can be given,
some familiar and others less so, some mathematical and others not.
For instance, if X is the set of all integers and R is interpreted to mean
“is less than,” which of course is usually denoted by the symbol <, then
we clearly have 4 < 7 and 5 ≮ 2. We have been speaking of binary
relations, which are so named because they apply only to ordered pairs
of elements, rather than to ordered triples, etc. In our work we drop the
qualifying adjective and speak simply of a relation in X, since we shall
have occasion to consider only relations of this kind.¹

¹ Some writers prefer to regard a relation R in X as a subset R of X × X. From
this point of view, x R y and x R̸ y are simply equivalent ways of writing (x,y) ∈ R
and (x,y) ∉ R. This definition has the advantage of being more tangible than ours,
and the disadvantage that few people really think of a relation in this way.

We now assume that a partition of our non-empty set X is given,
and we associate with this partition a relation in X. This relation is
defined in the following way: we say that x is equivalent to y and write
this x ~ y (the symbol ~ is pronounced “wiggle”), if x and y belong to
the same partition set. It is obvious that the relation ~ has the
following properties:

(1) x ~ x for every x (reflexivity);

(2) x ~ y ⇒ y ~ x (symmetry);

(3) x ~ y and y ~ z ⇒ x ~ z (transitivity).

This particular relation in X arose in a special way, in connection with a 
given partition of X, and its properties are immediate consequences of its 
definition. Any relation whatever in X which possesses these three 
properties is called an equivalence relation in X. 
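
As a small finite illustration, the relation attached to a partition can be built explicitly as a set of ordered pairs, and the three properties tested directly. This Python sketch is ours, not the text's; the function names relation_from_partition and is_equivalence are hypothetical, and the representation of a relation as a set of pairs is a choice made for convenience.

```python
def relation_from_partition(blocks):
    """Return the set of ordered pairs (x, y) with x, y in the same block."""
    return {(x, y) for block in blocks for x in block for y in block}

def is_equivalence(pairs, X):
    """Check reflexivity, symmetry, and transitivity on the finite set X."""
    reflexive = all((x, x) in pairs for x in X)
    symmetric = all((y, x) in pairs for (x, y) in pairs)
    transitive = all((x, w) in pairs
                     for (x, y) in pairs for (z, w) in pairs if y == z)
    return reflexive and symmetric and transitive

X = {1, 2, 3, 4, 5}
R = relation_from_partition([{1, 3, 5}, {2, 4}])
print(is_equivalence(R, X))                 # True
print(is_equivalence({(1, 2), (2, 1)}, X))  # False: not reflexive
```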

We have just seen that each partition of X has associated with 
it a natural equivalence relation in X. We now reverse the situation 
and show that a given equivalence relation in X determines a natural 
partition of X. 

Let ~ be an equivalence relation in X; that is, assume that it is 
reflexive, symmetric, and transitive in the sense described above. If x
is an element of X, the subset of X defined by [x] = {y : y ~ x} is called
the equivalence set of x. The equivalence set of x is thus the set of all
elements which are equivalent to x. We show that the class of all
distinct equivalence sets forms a partition of X. By reflexivity, x ∈ [x]
for each element x in X, so each equivalence set is non-empty and
their union is X. It remains to be shown that any two equivalence sets
[x_1] and [x_2] are either disjoint or identical. We prove this by showing
that if [x_1] and [x_2] are not disjoint, then they must be identical.
Suppose that [x_1] and [x_2] are not disjoint; that is, suppose that they have
a common element z. Since z belongs to both equivalence sets, z ~ x_1
and z ~ x_2, and by symmetry, x_1 ~ z. Let y be any element of [x_1],
so that y ~ x_1. Since y ~ x_1 and x_1 ~ z, transitivity shows that
y ~ z. By another application of transitivity, y ~ z and z ~ x_2 imply
that y ~ x_2, so that y is in [x_2]. Since y was chosen arbitrarily in [x_1], we
see by this that [x_1] ⊆ [x_2]. The same reasoning shows that [x_2] ⊆ [x_1], and
from this we conclude (see the last paragraph of Sec. 1) that [x_1] = [x_2].
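
The passage from an equivalence relation to its equivalence sets can be imitated on a finite set. The sketch below is only an illustration under our own assumptions: the function name equivalence_sets is hypothetical, and the sample relation (a ~ b when a − b is divisible by 3) is chosen purely for convenience.

```python
def equivalence_sets(X, equivalent):
    """Return the distinct equivalence sets [x] = {y : y ~ x} for x in X."""
    classes = []
    for x in X:
        cls = frozenset(y for y in X if equivalent(y, x))
        if cls not in classes:      # keep only distinct equivalence sets
            classes.append(cls)
    return classes

def same_remainder(a, b):
    """Sample relation: a ~ b when a - b is divisible by 3."""
    return (a - b) % 3 == 0

classes = equivalence_sets(range(10), same_remainder)
print([sorted(c) for c in classes])
# [[0, 3, 6, 9], [1, 4, 7], [2, 5, 8]]
```

Any two of the sets printed are disjoint and together they exhaust the original set, as the argument above guarantees.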

The above discussion demonstrates that there is no real distinction 
(other than a difference in language) between partitions of a set and 
equivalence relations in the set. If we start with a partition, we get an 
equivalence relation by regarding elements as equivalent if they belong 
to the same partition set, and if we start with an equivalence relation, 
we get a partition by grouping together into subsets all elements which 
are equivalent to one another. We have here a single mathematical 
idea, which we have been considering from two different points of view, 
and the approach we choose in any particular application depends entirely
on our own convenience. In practice, it is almost invariably the case that
we use equivalence relations (which are usually easy to define) to obtain 
partitions (which are sometimes difficult to describe fully). 

We now turn to several of the more important simple examples of 
equivalence relations. 

Let I be the set of all integers. If a and b are elements of this set, 
we write a = b (and say that a equals b) if a and b are the same integer.
Thus 2 + 3 = 5 means that the expressions on the left and right are
simply different ways of writing the same integer. It is apparent that =
used in this sense is an equivalence relation in the set I:

(1) a = a for every a;

(2) a = b ⇒ b = a;

(3) a = b and b = c ⇒ a = c.

Clearly, each equivalence set consists of precisely one integer. 

Another familiar example is the relation of equality commonly used 
for fractions. We remind the reader that, strictly speaking, a fraction 
is merely a symbol of the form a/b, where a and b are integers and b is 
not zero. The fractions 2/3 and 4/6 are obviously not identical, but
nevertheless we consider them to be equal. In general, we say that 
two fractions a/b and c/d are equal, written a/b = c/d, if ad and bc are
equal as integers in the usual sense (see the above paragraph). We leave
it to the reader to show that this is an equivalence relation in the set of 
all fractions. An equivalence set of fractions is what we call a rational 
number. Everyday usage ignores the distinction between fractions and 
rational numbers, but it is important to recognize that from the strict 
point of view it is the rational numbers (and not the fractions) which 
form part of the real number system. 
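
The distinction between fractions (symbols) and rational numbers (equivalence sets of symbols) can be made concrete in a short Python sketch. It is an illustration only: fractions are represented here as pairs (a, b), the function name equal_fractions is our own, and the standard-library class Fraction is used merely as a convenient label for each equivalence set.

```python
from fractions import Fraction

def equal_fractions(p, q):
    """The symbols a/b and c/d are equal as fractions when ad = bc."""
    (a, b), (c, d) = p, q
    return a * d == b * c

print(equal_fractions((2, 3), (4, 6)))   # True:  2*6 == 3*4
print(equal_fractions((1, 2), (2, 3)))   # False: 1*3 != 2*2

# Group some fraction symbols into equivalence sets (rational numbers).
symbols = [(1, 2), (2, 4), (-3, -6), (2, 3), (4, 6), (1, 3)]
rationals = {}
for a, b in symbols:
    # Fraction(a, b) reduces to lowest terms, so equal fractions
    # receive the same label.
    rationals.setdefault(Fraction(a, b), []).append((a, b))
print(rationals[Fraction(1, 2)])   # [(1, 2), (2, 4), (-3, -6)]
```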

Our final example has a deeper significance, for it provides us with 
the basic tool for our work of the next two sections. 

For the remainder of this section we consider a relation between 
pairs of non-empty sets, and each set mentioned (whether we say so 
explicitly or not) is assumed to be non-empty. If X and Y are two 
sets, we say that X is numerically equivalent to Y if there exists a one-to- 
one correspondence between X and Y, i.e., if there exists a one-to-one 
mapping of X onto Y. This relation is reflexive, since the identity
mapping i_X: X → X is one-to-one onto; it is symmetric, since if f: X → Y
is one-to-one onto, then its inverse mapping f^{-1}: Y → X is also one-to-one
onto; and it is transitive, since if f: X → Y and g: Y → Z are one-to-one
onto, then gf: X → Z is also one-to-one onto. Numerical equivalence has
all the properties of an equivalence relation, and if we consider it as an 
equivalence relation in the class of all non-empty subsets of some universal 
set U, it groups together into equivalence sets all those subsets of U 
which have the same number of elements. After we state and prove the
following very useful but rather technical theorem, we shall continue in
Secs. 6 and 7 with an exploration of the implications of these ideas. 

The theorem we have in mind—the Schroeder-Bernstein theorem—is 
the following: if X and Y are two sets each of which is numerically equivalent 
to a subset of the other, then all of X is numerically equivalent to all of Y. 
There are several proofs of this classic theorem, some of which are quite 
difficult. The very elegant proof we give is essentially due to Birkhoff 
and MacLane. 

Now for the proof. We assume that f: X → Y is a one-to-one
mapping of X into Y, and that g: Y → X is a one-to-one mapping of Y
into X. Our task is to produce a mapping F: X → Y which is one-to-one
onto. We may assume that neither f nor g is onto, since if f is, we can
define F to be f, and if g is, we can define F to be g^{-1}. Since both f and g
are one-to-one, it is permissible to use the mappings f^{-1} and g^{-1} as long
as we clearly understand that f^{-1} is defined only on f(X) and g^{-1} only on
g(Y). We obtain the mapping F by splitting both X and Y into subsets
which we characterize in terms of the ancestry of their elements. Let x
be an element of X. We apply g^{-1} to it (if we can) to get the element
g^{-1}(x) in Y. If g^{-1}(x) exists, we call it the first ancestor of x. The
element x itself we call the zeroth ancestor of x. We now apply f^{-1} to
g^{-1}(x) if we can, and if (f^{-1}g^{-1})(x) exists, we call it the second ancestor
of x. We now apply g^{-1} to (f^{-1}g^{-1})(x) if we can, and if (g^{-1}f^{-1}g^{-1})(x)
exists, we call it the third ancestor of x. As we continue this process of
tracing back the ancestry of x, it becomes apparent that there are three
possibilities. (1) x has infinitely many ancestors. We denote by X_i
the subset of X which consists of all elements with infinitely many
ancestors. (2) x has an even number of ancestors; this means that x
has a last ancestor (that is, one which itself has no first ancestor) in X.
We denote by X_e the subset of X consisting of all elements with an even
number of ancestors. (3) x has an odd number of ancestors; this
means that x has a last ancestor in Y. We denote by X_o the subset of X
which consists of all elements with an odd number of ancestors. The
three sets X_i, X_e, X_o form a disjoint class whose union is X. We
decompose Y in just the same way into three subsets Y_i, Y_e, Y_o. It is easy to
see that f maps X_i onto Y_i and X_e onto Y_o, and that g^{-1} maps X_o onto
Y_e; and we complete the proof by defining F in the following piecemeal
manner:


    F(x) = f(x)         if x ∈ X_i ∪ X_e,
    F(x) = g^{-1}(x)    if x ∈ X_o.
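
The ancestry construction can be traced by machine in a simple case. In the following Python sketch, which is our own illustration and not part of the proof, X and Y are both taken to be the set {0, 1, 2, . . .} and f and g are both the mapping n → n + 1; these choices, and all function names, are assumptions made for the example. Every element then has finitely many ancestors, so the set of elements with infinitely many ancestors is empty and the ancestor count always terminates.

```python
def f(x):            # one-to-one mapping of X into Y (0 is not a value)
    return x + 1

def g(y):            # one-to-one mapping of Y into X (0 is not a value)
    return y + 1

def f_inv(y):        # defined only on f(X)
    return y - 1 if y >= 1 else None

def g_inv(x):        # defined only on g(Y)
    return x - 1 if x >= 1 else None

def count_ancestors(x):
    """Count the first, second, third, ... ancestors of x in X.

    Assumes the chain terminates, which holds for this f and g; an
    element with infinitely many ancestors would loop forever here.
    """
    count, current, inverses = 0, x, (g_inv, f_inv)
    while True:
        parent = inverses[count % 2](current)   # g_inv, f_inv, g_inv, ...
        if parent is None:
            return count
        count, current = count + 1, parent

def F(x):
    # F = f on elements with an even number of ancestors,
    # F = g^{-1} on elements with an odd number of ancestors.
    return f(x) if count_ancestors(x) % 2 == 0 else g_inv(x)

print([F(x) for x in range(8)])   # [1, 0, 3, 2, 5, 4, 7, 6]
```

In this example F simply interchanges 0 with 1, 2 with 3, and so on, which is visibly one-to-one onto even though neither f nor g is onto.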


We attempt to illustrate these ideas in Fig. 12. Here we present two 
replicas of the situation: on the left, X and Y are represented by the 
vertical lines, and f and g by the lines slanting down to the right and
left; and on the right, we schematically trace the ancestry of three
elements in X, of which x_1 has no first ancestor, x_2 has a first and second
ancestor, and x_3 has a first, second, and third ancestor.


Fig. 12. The proof of the Schroeder-Bernstein theorem.


The Schroeder-Bernstein theorem has great theoretical and practical 
significance. Its main value for us lies in its role as a tool by means of 
which we can prove numerical equivalence with a minimum of effort for 
many specific sets. We put it to work in Sec. 7. 


Problems 


1. Let f: X → Y be an arbitrary mapping. Define a relation in X as
follows: x_1 ~ x_2 means that f(x_1) = f(x_2). Show that this is an
equivalence relation and describe the equivalence sets. 

2. In the set R of all real numbers, let x ~ y mean that x − y is an
integer. Show that this is an equivalence relation and describe the 
equivalence sets. 

3. Let I be the set of all integers, and let m be a fixed positive integer. 
Two integers a and b are said to be congruent modulo m, symbolized
by a ≡ b (mod m), if a − b is exactly divisible by m, i.e., if a − b is
an integral multiple of m. Show that this is an equivalence relation, 
describe the equivalence sets, and state the number of distinct 
equivalence sets. 

4. Decide which ones of the three properties of reflexivity, symmetry, 
and transitivity are true for each of the following relations in the set
of all positive integers: m < n, m ≤ n, m divides n. Are any of
these equivalence relations?

5. Give an example of a relation which is (a) reflexive but not
symmetric or transitive; (b) symmetric but not reflexive or transitive;
(c) transitive but not reflexive or symmetric; (d) reflexive and 
symmetric but not transitive; (e) reflexive and transitive but not 
symmetric; (f) symmetric and transitive but not reflexive. 

6. Let X be a non-empty set and ~ a relation in X. The following 
purports to be a proof of the statement that if this relation is
symmetric and transitive, then it is necessarily reflexive: x ~ y ⇒ y ~ x;
x ~ y and y ~ x ⇒ x ~ x; therefore x ~ x for every x. In view
of Problem 5f, this cannot be a valid proof. What is the flaw in the 
reasoning? 

7. Let X be a non-empty set. A relation ~ in X is called circular if
x ~ y and y ~ z ⇒ z ~ x, and triangular if x ~ y and x ~ z ⇒ y ~ z.
Prove that a relation in X is an equivalence relation ⇔ it is reflexive
and circular ⇔ it is reflexive and triangular.


6. COUNTABLE SETS 


The subject of this section and the next—infinite cardinal numbers— 
lies at the very foundation of modern mathematics. It is a vital instru- 
ment in the day-to-day work of many mathematicians, and we shall 
make extensive use of it ourselves. This theory, which was created by 
the German mathematician Cantor, also has great aesthetic appeal, for 
it begins with ideas of extreme simplicity and develops through natural 
stages into an elaborate and beautiful structure of thought. In the 
course of our discussion we shall answer questions which no one before 
Cantor’s time thought to ask, and we shall ask a question which no one 
can answer to this day. 

Without further ado, we can say that cardinal numbers are those used 
in counting, such as the positive integers (or natural numbers) 1, 2, 
3, . . . familiar to us all. But there is much more to the story than this. 

The act of counting is undoubtedly one of the oldest of human 
activities. Men probably learned to count in a crude way at about the 
same time as they began to develop articulate speech. The earliest men 
who lived in communities and domesticated animals must have found 
it necessary to record the number of goats in the village herd by means of 
a pile of stones or some similar device. If the herd was counted in each 
night by removing one stone from the pile for each goat accounted for, 
then stones left over would have indicated strays, and herdsmen would 
have gone out to search for them. Names for numbers and symbols for
them, like our 1, 2, 3, . . . , would have been superfluous. The simple
and yet profound idea of a one-to-one correspondence between the 
stones and the goats would have fully met the needs of the situation. 

In a manner of speaking, we ourselves use the infinite set 


N = {1,2,3,...} 


of all positive integers as a “pile of stones.” We carry this set around
with us as part of our intellectual equipment. Whenever we want to count 
a set, say, a stack of dollar bills, we start through the set N and tally off 
one bill against each positive integer as we come to it. The last number 
we reach, corresponding to the last bill, is what we call the number of 
bills in the stack. If this last number happens to be 10, then “10” is
our symbol for the number of bills in the stack, as it also is for the number 
of our fingers, and for the number of our toes, and for the number of 
elements in any set which can be put into one-to-one correspondence 
with the finite set {1, 2,..., 10}. Our procedure is slightly more 
sophisticated than that of the primitive savage. We have the symbols 
1, 2, 3, . . . for the numbers which arise in counting; we can record
them for future use, and communicate them to other people, and manipu- 
late them by the operations of arithmetic. But the underlying idea, 
that of the one-to-one correspondence, remains the same for us as it 
probably was for him. 

The positive integers are adequate for the purpose of counting any 
non-empty finite set, and since outside of mathematics all sets appear to 
be of this kind, they suffice for all non-mathematical counting. But in 
the world of mathematics we are obliged to consider many infinite sets, 
such as the set of all positive integers itself, the set of all integers, the 
set of all rational numbers, the set of all real numbers, the set of all 
points in a plane, and so on. It is often important to be able to count 
such sets, and it was Cantor’s idea to do this, and to develop a theory of 
infinite cardinal numbers, by means of one-to-one correspondences. 

In comparing the sizes of two sets, the basic concept is that of 
numerical equivalence as defined in the previous section. We recall 
that two non-empty sets X and Y are said to be numerically equivalent if 
there exists a one-to-one mapping of one onto the other, or—and this 
amounts to the same thing—if there can be found a one-to-one corre- 
spondence between them. To say that two non-empty finite sets are 
numerically equivalent is of course to say that they have the same number 
of elements in the ordinary sense. If we count one of them, we simply 
establish a one-to-one correspondence between its elements and a set of 
positive integers of the form {1, 2, ... , n}, and we then say that n is 
the number of elements possessed by both, or the cardinal number of both. 
The positive integers are the finite cardinal numbers. We encounter
many surprises as we follow Cantor and consider numerical equivalence
for infinite sets. 

The set N = {1, 2, 3, . . .} of all positive integers is obviously 
“larger” than the set {2, 4, 6, . . .} of all even positive integers, for it 
contains this set as a proper subset. It appears on the surface that N has 
“more” elements. But it is very important to avoid jumping to con- 
clusions when dealing with infinite sets, and we must remember that our 
criterion in these matters is whether there exists a one-to-one corre- 
spondence between the sets (not whether one set is or is not a proper 
subset of the other). As a matter of fact, the pairing 


1, 2, 3, . . . , n, . . .
2, 4, 6, . . . , 2n, . . .


serves to establish a one-to-one correspondence between these sets, in 
which each positive integer in the upper row is matched with the even 
positive integer (its double) directly below it, and these two sets must 
therefore be regarded as having the same number of elements. This is a 
very remarkable circumstance, for it seems to contradict our intuition 
and yet is based only on solid common sense. We shall see below, in 
Problems 6 and 7-4, that every infinite set is numerically equivalent to a 
proper subset of itself. Since this property is clearly not possessed by 
any finite set, some writers even use it as the definition of an infinite set. 

In much the same way as above, we can show that N is numerically
equivalent to the set of all even integers: 


1, 2,  3, 4,  5, 6,  7, . . .
0, 2, −2, 4, −4, 6, −6, . . .


Here our device is to start with 0 and follow each even positive integer 
as we come to it by its negative. Similarly, N is numerically equivalent 
to the set of all integers: 


1, 2,  3, 4,  5, 6,  7, . . .
0, 1, −1, 2, −2, 3, −3, . . .
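
Each of these listings is given by an explicit rule, and the rule can be written down as a function of the position n. The Python sketch below is our own illustration, with hypothetical function names; it assumes a listing of all integers that starts with 0 and follows each positive integer by its negative.

```python
def nth_even_positive(n):
    """Position n -> n-th even positive integer: 1 -> 2, 2 -> 4, ..."""
    return 2 * n

def nth_integer(n):
    """Position n -> n-th integer in the listing 0, 1, -1, 2, -2, ..."""
    return n // 2 if n % 2 == 0 else -(n // 2)

def position_of(k):
    """Inverse of nth_integer: the position at which the integer k occurs."""
    return 2 * k if k > 0 else 1 - 2 * k

print([nth_even_positive(n) for n in range(1, 6)])  # [2, 4, 6, 8, 10]
print([nth_integer(n) for n in range(1, 8)])        # [0, 1, -1, 2, -2, 3, -3]
print([2 * nth_integer(n) for n in range(1, 8)])    # [0, 2, -2, 4, -4, 6, -6]
```

The existence of the inverse position_of is what makes the listing a one-to-one correspondence: every integer occurs, and at exactly one position.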


It is of considerable historical interest to note that Galileo observed in the 
early seventeenth century that there are precisely as many perfect 
squares (1, 4, 9, 16, 25, etc.) among the positive integers as there are 
positive integers altogether. This is clear from the pairing 


1,   2,   3,   4,   5,   . . .
1^2, 2^2, 3^2, 4^2, 5^2, . . .


It struck him as very strange that this should be true, considering how
sparsely strewn the squares are among all the positive integers. But
the time appears not to have been ripe for the exploration of this phenom- 
enon, or perhaps he had other things on his mind; in any case, he did not 
follow up his idea. 

These examples should make it clear that all that is really necessary 
in showing that an infinite set X is numerically equivalent to N is that we 
be able to list the elements of X, with a first, a second, a third, and so on, 
in such a way that it is completely exhausted by this counting off of its 
elements. It is for this reason that any infinite set which is numerically 
equivalent to N is said to be countably infinite. We say that a set is 
countable if it is non-empty and finite (in which case it can obviously be 
counted) or if it is countably infinite. 

One of Cantor’s earliest discoveries in his study of infinite sets was 
that the set of all positive rational numbers (which is very large: it 
contains N and a great many other numbers besides) is actually countable. 
We cannot list the positive rational numbers in order of size, as we can 
the positive integers, beginning with the smallest, then the next smallest, 
and so on, for there is no smallest, and between any two there are infi- 
nitely many others. We must find some other way of counting them, 
and following Cantor, we arrange them not in order of size, but according 
to the size of the sum of the numerator and denominator. We begin
with all positive rationals whose numerator and denominator add up to 2:
there is only one, 1/1 = 1. Next we list (with increasing numerators) all
those for which this sum is 3: 1/2, 2/1 = 2. Next, all those for which this
sum is 4: 1/3, 2/2 = 1, 3/1 = 3. Next, all those for which this sum is
5: 1/4, 2/3, 3/2, 4/1 = 4. Next, all those for which this sum is 6: 1/5,
2/4 = 1/2, 3/3 = 1, 4/2 = 2, 5/1 = 5. And so on. If we now list all these
together from the beginning, omitting those already listed when we come
to them, we get a sequence


1, 1/2, 2, 1/3, 3, 1/4, 2/3, 3/2, 4, 1/5, 5, . . .


which contains each positive rational number once and only once. 
Figure 13 gives a schematic representation of this manner of listing the 
positive rationals. In this figure the first row contains all positive 
rationals with numerator 1, the second all with numerator 2, etc.; and 
the first column contains all with denominator 1, the second all with 
denominator 2, and so on. Our listing amounts to traversing this array 
of numbers as the arrows indicate, where of course all those numbers 
already encountered are left out. 
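
This method of listing is entirely mechanical, and it may help to see it carried out by a short program. The Python sketch below is an illustration of the rule just described (order by the sum of numerator and denominator, then by increasing numerator, omitting repetitions); the generator name positive_rationals is our own.

```python
from fractions import Fraction
from itertools import count, islice

def positive_rationals():
    """Yield each positive rational once, ordered by numerator + denominator."""
    seen = set()
    for total in count(2):                 # total = numerator + denominator
        for numerator in range(1, total):  # increasing numerators
            q = Fraction(numerator, total - numerator)
            if q not in seen:              # omit those already listed
                seen.add(q)
                yield q

print([str(q) for q in islice(positive_rationals(), 11)])
# ['1', '1/2', '2', '1/3', '3', '1/4', '2/3', '3/2', '4', '1/5', '5']
```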

It is high time that we christened the infinite cardinal number we have
been discussing, and for this purpose we use the first letter of the Hebrew
alphabet (ℵ, pronounced “aleph”) with 0 as a subscript. We say
that ℵ_0 is the number of elements in any countably infinite set. Our
complete list of cardinal numbers so far is

1, 2, 3, . . . , ℵ_0.


We expand this list in the next section. 

    1/1   1/2   1/3   1/4   1/5   . . .
    2/1   2/2   2/3   2/4   2/5   . . .
    3/1   3/2   3/3   3/4   3/5   . . .
    4/1   4/2   4/3   4/4   4/5   . . .
    5/1   5/2   5/3   5/4   5/5   . . .
    . . .

Fig. 13. A listing of the positive rationals.


Suppose now that m and n are two cardinal numbers (finite or
infinite). The statement that m is less than n (written m < n) is defined
to mean the following: if X and Y are sets with m and n elements, then
(1) there exists a one-to-one mapping of X into Y, and (2) there does not
exist a one-to-one mapping of X onto Y. Using this concept, it is easy to 
relate our cardinal numbers to one another by means of 


1 < 2 < 3 < . . . < ℵ_0.


With respect to the finite cardinal numbers, this ordering corresponds 
to their usual ordering as real numbers. 


Problems 


1. Prove that the set of all rational numbers (positive, negative, and 
zero) is countable. (Hint: see our method of showing that the set of 
all integers is countable.) 

2. Use the idea behind Fig. 13 to prove that if {X_i} is a countable class
of countable sets, then ∪_i X_i is also countable. We usually express
this by saying that any countable union of countable sets is countable.

3. Prove that the set of all rational points in the coordinate plane R^2
(i.e., all points whose coordinates are both rational) is countable.

4. Prove that if X_1 and X_2 are countable, then X_1 × X_2 is also
countable.

5. Prove that if X_1, X_2, . . . , X_n are countable, where n is any positive
integer, then X_1 × X_2 × · · · × X_n is also countable.


6. Prove that every countably infinite set is numerically equivalent to a 
proper subset of itself. 

7. Prove that any non-empty subset of a countable set is countable. 

8. Let X and Y be non-empty sets, and f a mapping of X onto Y. 
If X is countable, prove that Y is also countable. 


7. UNCOUNTABLE SETS 


All the infinite sets we considered in the previous section were 
countable, so it might appear at this stage that every infinite set is counta- 
ble. If this were true, if the end result of the analysis of infinite sets 
were that they are all numerically equivalent to one another, then 
Cantor’s theory would be relatively trivial. But this is not the case, for 
Cantor discovered that the infinite set R of all real numbers is not
countable, or, as we phrase it, R is uncountable or uncountably infinite.
Since we
customarily identify the elements of R with the points of the real line 
(see Sec. 4), this amounts to the assertion that the set of all points on the 
real line represents a “higher type of infinity” than that of only the 
integral points or only the rational points. 

Cantor’s proof of this is very ingenious, but it is actually quite 
simple. In outline the procedure is as follows: we assume that all the 
real numbers (in decimal form) can be listed, and in fact have been 
listed; then we produce a real number which cannot be in this list—thus 
contradicting our initial assumption that a complete listing is possible. 
In representing real numbers by decimals, we use the scheme of decimal 
expansion in which infinite chains of 9’s are avoided; for instance, we 
write 1/2 as .5000 . . . and not as .4999 . . . . In this way we guarantee
that each real number has one and only one decimal representation. 
Suppose now that we can list all the real numbers, and that they have 
been listed in a column like the one below (where we use particular 
numbers for the purpose of illustration). 


1st number      13 + .712983 . . .
2nd number      −4 + .913572 . . .
3rd number       0 + .848265 . . .
   . . .


Since it is impossible actually to write down this infinite list of decimals,
our assumption that all the real numbers can be listed in this way means 
that we assume that we have available some general rule according to 
which the list is constructed, similar to that used for listing the positive 
rationals, and that every conceivable real number occurs somewhere 
in this list. We now demonstrate that this assumption is false by
exhibiting a decimal .a_1 a_2 a_3 . . . which is constructed in such a way
that it is not in the list. We choose a_1 to be 1 unless the first digit after
the decimal point of the first number in our list is 1, in which case we
choose a_1 to be 2. Clearly, our new decimal will differ from the first
number in our list regardless of how we choose its remaining digits. Next,
we choose a_2 to be 1 unless the second digit after the decimal point of
the second number in our list is 1, in which case we choose a_2 to be 2.
Just as above, our new decimal will necessarily differ from the second
number in our list. We continue building up the decimal .a_1 a_2 a_3 . . .
in this
way, and since the process can be continued indefinitely, it defines a real 
number in decimal form (.121 . . . in the case of our illustrative example) 
which is different from each number in our list. This contradicts our 
assumption that we can list all the real numbers and completes our proof 
of the fact that the set R of all real numbers is uncountable. 
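
The rule for choosing the digits a_1, a_2, a_3, . . . can be applied to any finite portion of a proposed list. The Python sketch below is an illustration only: it works with finitely many digits, whereas the proof concerns infinite decimals, and the function name diagonal_decimal is our own.

```python
def diagonal_decimal(listed_digits):
    """Given the digits after the decimal point of the 1st, 2nd, ... listed
    numbers, return a decimal that differs from the n-th number in its
    n-th digit."""
    new_digits = []
    for n, digits in enumerate(listed_digits):
        # Choose 1 unless the n-th digit of the n-th number is 1; then 2.
        new_digits.append("2" if digits[n] == "1" else "1")
    return "." + "".join(new_digits)

listed = ["712983", "913572", "848265"]   # the illustrative list above
print(diagonal_decimal(listed))           # .121
```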

We have seen (in Problem 6-1) that the set of all rational points on 
the real line is countable, and we have just proved that the set of all 
points on the real line is uncountable. We conclude at once from this 
that irrational points on the real line (i.e., irrational numbers) must 
exist. In fact, it is very easy to see by means of Problem 6-2 that the 
set of all irrational numbers is uncountably infinite. To vary slightly a 
striking metaphor coined by E. T. Bell, the rational numbers are spotted 
along the real line like stars against a black sky, and the dense blackness 
of the background is the firmament of the irrationals. The reader is 
probably familiar with a proof of the fact that the square root of 2 is 
irrational. This proof demonstrates the existence of irrational numbers 
by exhibiting a specimen. Our remarks, on the other hand, do not show 
that this or that particular number is irrational; they merely show that 
such numbers must exist, and moreover must exist in overwhelming 
abundance. 

If the reader supposes that the set of all points on the real line R is
uncountable because R is infinitely long, then we can disillusion him by
the following argument, which shows that any open interval on R, no
matter how short it may be, has precisely as many points as R itself.
Let a and b be any two real numbers with a < b, and consider the open 
interval (a,b). Figure 14 shows how to establish a one-to-one corre- 
spondence between the points P of (a,b) and the points P’ of R: we 
bend (a,b) into a semicircle; we rest this semicircle tangentially on the
real line R as shown in the figure; and we link P and P’ by projecting
from its center. If formulas are preferred over geometric reasoning of
this kind, we observe that y = a + (b − a)x is a numerical equivalence
between real numbers x ∈ (0,1) and y ∈ (a,b), and that z = tan π(x − 1/2)
is another numerical equivalence between (0,1) and all of R. It now
follows that (a,b) and R are numerically equivalent to one another. 
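
The two formulas y = a + (b − a)x and z = tan π(x − 1/2) can be combined and tried out numerically. The following Python sketch is our own illustration; the function names are hypothetical, and floating-point arithmetic only approximates the exact correspondence.

```python
import math

def unit_to_interval(x, a, b):
    """y = a + (b - a) x maps (0,1) one-to-one onto (a,b)."""
    return a + (b - a) * x

def unit_to_line(x):
    """z = tan(pi (x - 1/2)) maps (0,1) one-to-one onto R."""
    return math.tan(math.pi * (x - 0.5))

def line_to_unit(z):
    """Inverse of unit_to_line: x = 1/2 + arctan(z) / pi."""
    return 0.5 + math.atan(z) / math.pi

def interval_to_line(y, a, b):
    """Composite correspondence from (a,b) to R."""
    return unit_to_line((y - a) / (b - a))

print(interval_to_line(2.5, 2.0, 3.0))   # 0.0 (the midpoint goes to 0)
print(line_to_unit(unit_to_line(0.3)))   # approximately 0.3
```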

We are now in a position to show that any subset X of the real line
R which contains an open interval I is numerically equivalent to R, no
matter how complicated the structure of X may be. The proof of this
fact is very simple, and it uses only the Schroeder-Bernstein theorem and
our above result that I is numerically equivalent to R. The argument
can be given in two sentences. Since X is numerically equivalent to
itself, it is obviously numerically equivalent to a subset of R; and R is
numerically equivalent to a subset of X, namely, to I. It is now a direct
consequence of the Schroeder-Bernstein theorem that X and R are 
numerically equivalent to one another. We point out that all numerical 
equivalences up to this point have been established by actually exhibiting 
one-to-one correspondences between the sets concerned. In the present 
situation, however, it is not feasible to do this, for very little has been 
assumed about the specific nature of the set X. Without the help of the 
Schroeder-Bernstein theorem it would be very difficult to prove theorems 
of this type. 

Fig. 14. A one-to-one correspondence between an open interval and the
real line.

We give another interesting application of the Schroeder-Bernstein
theorem. Consider the coordinate plane R^2 and the subset X of R^2
defined by X = {(x,y): 0 ≤ x < 1 and 0 ≤ y < 1}. We show that X is
numerically equivalent to the closed-open interval


I = {(x,y): 0 ≤ x < 1 and y = 0}


which forms its base (see Fig. 15). Since J is numerically equivalent to a 
subset of X, namely, to J itself, our conclusion will follow at once from 
the Schroeder-Bernstein theorem if we can establish a one-to-one mapping 
of X into J. This we now do. Let (x,y) be an arbitrary point of X. 
Each of the coordinates z and y has a unique decimal expansion which 
does not end in an infinite chain of 9’s. We form another decimal z from 
these by alternating their digits; for example, if 2 = .327 ... and 
y= 614..., then z= 362174 ..... We now identify z (which 
cannot end in an infinite chain of 9’s) with a point of J. This gives the 
required one-to-one mapping of X into I and yields the somewhat 
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startling result that there are no more points inside a square than there 
are on one of its sides. 
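
The digit-alternating rule can be tried on finite strings of digits. This is only a sketch in Python: it works with finitely many digits after the decimal point (a real decimal expansion is infinite), and the helper names `interleave` and `split` are hypothetical.

```python
from itertools import zip_longest

def interleave(x_digits, y_digits):
    """Alternate the digits of x and y (digits after the decimal point)."""
    # Pad the shorter string with 0's, as a terminating decimal would be.
    pairs = zip_longest(x_digits, y_digits, fillvalue="0")
    return "".join(dx + dy for dx, dy in pairs)

def split(z_digits):
    """Recover the digits of x and y from the interleaved digits of z."""
    return z_digits[0::2], z_digits[1::2]

z = interleave("327", "614")
print("." + z)     # .362174
print(split(z))    # ('327', '614')
```

Because `split` recovers both coordinates from z, two different points of the square can never yield the same z, which is the one-to-one property the argument needs.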

In Sec. 6 we introduced the symbol ℵ₀ for the number of elements
in any countably infinite set. At the beginning of this section we proved
that the set R of all real numbers (or of all points on the real line) is
uncountably infinite. We now introduce the symbol c (called the
cardinal number of the continuum) for the number of elements in R.
c is the cardinal number of R and of any set which is numerically
equivalent to R. In the above three paragraphs we have demonstrated
that c is the cardinal number of any open interval, of any subset of R
which contains an open interval, and of the subset X of the coordinate
plane which is illustrated in Fig. 15. Our list of cardinal numbers has
now grown to

1, 2, 3, . . . , ℵ₀, c,

and they are related to each other by

1 < 2 < 3 < · · · < ℵ₀ < c.

Fig. 15. The set X in the coordinate plane (x axis, y axis; the point (0,1) is marked).


At this point we encounter one of the most famous unsolved problems of
mathematics. Is there a cardinal number greater than ℵ₀ and less than
c? No one knows the answer to this question. Cantor himself thought
that there is no such number, or in other words, that c is the next infinite
cardinal number greater than ℵ₀, and his guess has come to be known as
Cantor’s continuum hypothesis. The continuum hypothesis can also be
expressed by the assertion that every uncountable set of real numbers
has c as its cardinal number.¹

There is another question which arises naturally at this stage, and 
this one we are fortunately able to answer. Are there any infinite 
cardinal numbers greater than c? Yes, there are; for example, the 
cardinal number of the class of all subsets of R. This answer depends on 
the following fact: if X is any non-empty set, then the cardinal number of 
X is less than the cardinal number of the class of all subsets of X. 

¹ For further information about the continuum hypothesis, see Wilder [42, p. 125]
and Gödel [12].

We prove this statement as follows. In accordance with the definition
given in the last paragraph of the previous section, we must show
(1) that there exists a one-to-one mapping of X into the class of all its
subsets, and (2) that there does not exist such a mapping of X onto this
class. To prove (1), we have only to point to the mapping x → {x},
which makes correspond to each element x that set {x} which consists of
the element x alone. We prove (2) indirectly. Let us assume that
there does exist a one-to-one mapping f of X onto the class of all its
subsets. We now deduce a contradiction from the assumed existence of
such a mapping. Let A be the subset of X defined by A = {x : x ∉ f(x)}.
Since our mapping f is onto, there must exist an element a in X such that
f(a) = A. Where is the element a? If a is in A, then by the definition
of A we have a ∉ f(a), and since f(a) = A, a ∉ A. This is a contradiction,
so a cannot belong to A. But if a is not in A, then again by the definition
of A we have a ∈ f(a) or a ∈ A, which is another contradiction. The
situation is impossible, so our assumption that such a mapping exists
must be false.
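
For a small finite set the argument can be checked exhaustively. The sketch below, in Python, is an illustration rather than a proof; the set X = {1, 2, 3} and the helper names are assumptions. It tries every mapping f of X into the class of its subsets and confirms that the set A = {x : x ∉ f(x)} is never a value of f.

```python
from itertools import chain, combinations, product

def power_set(xs):
    """All subsets of xs, as frozensets."""
    xs = list(xs)
    subsets = chain.from_iterable(combinations(xs, r) for r in range(len(xs) + 1))
    return [frozenset(s) for s in subsets]

def diagonal_set(f, xs):
    """The set A = {x : x not in f(x)} used in the proof."""
    return frozenset(x for x in xs if x not in f[x])

X = [1, 2, 3]
subsets = power_set(X)

# Try every mapping f of X into its class of subsets (8**3 = 512 mappings).
missed_every_time = True
for images in product(subsets, repeat=len(X)):
    f = dict(zip(X, images))
    A = diagonal_set(f, X)
    if A in f.values():          # this would mean f(a) = A for some a
        missed_every_time = False

print(missed_every_time)         # True
```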

This result guarantees that given any cardinal number, there always
exists a greater one. If we start with a set X₁ = {1} containing one
element, then there are two subsets, the empty set ∅ and the set {1}
itself. If X₂ = {1,2} is a set containing two elements, then there are
four subsets: ∅, {1}, {2}, {1,2}. If X₃ = {1,2,3} is a set containing
three elements, then there are eight subsets: ∅, {1}, {2}, {3}, {1,2}, {1,3},
{2,3}, {1,2,3}. In general, if Xₙ is a set with n elements, where n is any
finite cardinal number, then Xₙ has 2ⁿ subsets. If we now take n to be
any infinite cardinal number, the above facts suggest that we define 2ⁿ to
be the number of subsets of any set with n elements. If n is the first
infinite cardinal number, namely, ℵ₀, then it can be shown that

2^ℵ₀ = c.

The simplest proof of this fact depends on the ideas developed in the
following paragraph.

Consider the closed-open unit interval [0,1) and a real number x in
this set. Our concern is with the meaning of the decimal, binary, and
ternary expansions of x. For the sake of clarity, let us take x to be 1/4.
How do we arrive at the decimal expansion of 1/4? First, we split [0,1)
into the 10 closed-open intervals

[0,1/10), [1/10,2/10), . . . , [9/10,1),

and we use the 10 digits 0, 1, . . . , 9 to number them in order. Our
number 1/4 belongs to exactly one of these intervals, namely, to [2/10,3/10).
We have labeled this interval with the digit 2, so 2 is the first digit after
the decimal point in the decimal expansion of 1/4:

1/4 = .2 . . . .

Next, we split the interval [2/10,3/10) into the 10 closed-open intervals

[20/100,21/100), [21/100,22/100), . . . , [29/100,30/100),

and we use the 10 digits to number these in order. Our number 1/4
belongs to [25/100,26/100), which is labeled with the digit 5, so 5 is the
second digit after the decimal point in the decimal expansion of 1/4:

1/4 = .25 . . . .

If we continue this process exactly as we started it, we can obtain the
decimal expansion of 1/4 to as many places as we wish. As a matter of
fact, if we do continue, we get 0 at each stage from this point on:

1/4 = .25000 . . . .


The reader should notice that there is no ambiguity in this system as we
have explained it: contrary to customary usage, .24999 . . . is not to
be regarded as another decimal expansion of 1/4 which is “equivalent” to
.25000 . . . . In this system, each real number x in [0,1) has one and
only one decimal expansion, which cannot end in an infinite chain of 9’s.

There is nothing magical about the role of the number 10 in the above
discussion. If at each stage we split our closed-open interval into two
equal closed-open intervals, and if we use the two digits 0 and 1 to
number them, we obtain the binary expansion of any real number x in
[0,1). The binary expansion of 1/4 is easily seen to be .01000 . . . .
The ternary expansion of x is found similarly: at each stage we split our
closed-open interval into three equal closed-open intervals, and we use
the three digits 0, 1, and 2 to number them. A moment’s thought should
convince the reader that the ternary expansion of 1/4 is .020202 . . . .
Just as (in our system) the decimal expansion of a number in [0,1) cannot
end in an infinite chain of 9’s, so also its binary expansion cannot end in
an infinite chain of 1’s, and its ternary expansion cannot end in an
infinite chain of 2’s.
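
The interval-splitting procedure is easy to mechanize. The sketch below, in Python, uses exact rational arithmetic so that no rounding occurs; the function name `expansion` and its parameters are assumptions made for this illustration.

```python
from fractions import Fraction

def expansion(x, base, places):
    """First `places` digits of x in [0, 1) in the given base."""
    if not 0 <= x < 1:
        raise ValueError("x must lie in [0, 1)")
    digits = []
    for _ in range(places):
        x *= base          # stretch the current closed-open interval to [0, base)
        d = int(x)         # label of the subinterval that contains x
        digits.append(d)
        x -= d             # return to [0, 1) for the next stage
    return digits

quarter = Fraction(1, 4)
print(expansion(quarter, 10, 5))   # [2, 5, 0, 0, 0]
print(expansion(quarter, 2, 5))    # [0, 1, 0, 0, 0]
print(expansion(quarter, 3, 6))    # [0, 2, 0, 2, 0, 2]
```

Since each stage picks the closed-open subinterval containing x, the procedure yields .25000 . . . for 1/4 and can never produce .24999 . . . .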

We now use this machinery to give a proof of the fact that

2^ℵ₀ = c.

Consider the two sets N = {1, 2, 3, . . .} and I = [0,1), the first with
cardinal number ℵ₀ and the second with cardinal number c. If 𝐍 denotes
the class of all subsets of N, then by definition 𝐍 has cardinal number
2^ℵ₀. Our proof amounts to showing that there exists a one-to-one
correspondence between 𝐍 and I. We begin by establishing a one-to-one
mapping f of 𝐍 into I. If A is a subset of N, then f(A) is that real
number x in I whose decimal expansion x = .d₁d₂d₃ . . . is defined by the
condition that dₙ is 3 or 5 according as n is or is not in A. Any other two
digits can be used here, as long as neither of them is 9. Next, we
construct a one-to-one mapping g of I into 𝐍. If x is a real number in I, and
if x = .b₁b₂b₃ . . . is its binary expansion (so that each bₙ is either 0 or 1),
then g(x) is that subset A of N defined by A = {n : bₙ = 1}. We
conclude the proof with an appeal to the Schroeder-Bernstein theorem, which
guarantees that under these conditions 𝐍 and I are numerically equivalent
to one another.
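
The two mappings f and g can be illustrated on finite truncations. In the Python sketch below, only the first few digits are computed, so it shows the rules for the digits rather than the full infinite expansions; the names `f_digits` and `g_subset` are hypothetical.

```python
def f_digits(A, places):
    """First `places` decimal digits of f(A): digit n is 3 if n is in A, else 5."""
    return "".join("3" if n in A else "5" for n in range(1, places + 1))

def g_subset(bits):
    """The part of g(x) determined by the binary digits b_1, ..., b_k of x."""
    return {n for n, b in enumerate(bits, start=1) if b == 1}

print("." + f_digits({1, 3}, 6))           # .353555
print(sorted(g_subset([0, 1, 0, 0, 1])))   # [2, 5]
```

Different subsets A differ at some n, so their images under f differ in the nth digit; this is why f is one-to-one.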

If we follow up the hint contained in the fact that 2^ℵ₀ = c, and
successively form 2^c, 2^(2^c), and so on, we get a chain of cardinal numbers

1 < 2 < 3 < · · · < ℵ₀ < c < 2^c < 2^(2^c) < · · ·

in which there are infinitely many infinite cardinal numbers. Clearly,
there is only one kind of countable infinity, symbolized by ℵ₀, and
beyond this there is an infinite hierarchy of uncountable infinities which
are all distinct from one another.

At this point we bring our discussion of these matters to a close. 
We have barely touched on Cantor’s theory and have left entirely to 
one side, for instance, all questions relating to the addition and multi- 
plication of infinite cardinal numbers and the rules of arithmetic which 
apply to these operations. We have developed these ideas, not for 
their own sake, but for the sake of their applications in algebra and 
topology, and our main purpose throughout the last two sections has 
been to give the reader some of the necessary insight into countable and 
uncountable sets and the distinction between them.¹


Problems 


1. Show geometrically that the set of all points in the coordinate plane
R² is numerically equivalent to the subset X of R² illustrated in
Fig. 15 and defined by X = {(x,y): 0 ≤ x < 1 and 0 ≤ y < 1}, and
that therefore R² has cardinal number c. [Hint: rest an open
hemispherical surface (= a hemispherical surface minus its boundary)
tangentially on the center of X, project from various points on the
line through its center and perpendicular to R², and use the
Schroeder-Bernstein theorem.]

2. Show that the subset X of R³ defined by

X = {(x₁, x₂, x₃): 0 ≤ xᵢ < 1 for i = 1, 2, 3}

has cardinal number c.


¹ For the reader who wishes to learn something about the arithmetic of infinite
cardinal numbers, we recommend Halmos [16, sec. 24], Kamke [24, chap. 2],
Sierpinski [37, chaps. 7-10], or Fraenkel [9, chap. 2].

3. Let n be a positive integer and consider a polynomial equation of the
form

aₙxⁿ + aₙ₋₁xⁿ⁻¹ + · · · + a₁x + a₀ = 0,

with integral coefficients and aₙ ≠ 0. Such an equation has precisely
n complex roots (some of which, of course, may be real). An
algebraic number is a complex number which is a root of such an
equation. The set of all algebraic numbers contains the set of all
rational numbers (e.g., 2/3 is the root of 3x − 2 = 0) and many
other numbers besides (the square root of 2 is a root of x² − 2 = 0,
and 1 + i is a root of x² − 2x + 2 = 0). Complex numbers which
are not algebraic are called transcendental. The numbers e and π are
the best known transcendental numbers, though the fact that they
are transcendental is quite difficult to prove (see Niven [33, chap. 9]).
Prove that real transcendental numbers exist (hint: see Problem 6-5).
Prove also that the set of all real transcendental numbers is
uncountably infinite.

4. Prove that every infinite set is numerically equivalent to a proper 
subset of itself (hint: see Problem 6-6). 

5. Prove that the set of all real functions defined on the closed unit 
interval has cardinal number 2^c. [Hint: there are at least as many
such functions as there are characteristic functions (i.e., functions 
whose values are 0 or 1) defined on the closed unit interval.] 


8. PARTIALLY ORDERED SETS AND LATTICES 


There are two types of relations which often arise in mathematics: 
order relations and equivalence relations. We touched briefly on 
order relations in Problem 1-2, and in Section 5 we discussed equivalence 
relations in some detail. We now return to the topic of order relations 
and develop those parts of this subject which are necessary for our later 
work. The reader will find it helpful to keep in mind that a partial order
relation (as we define it below) is a generalization of both set inclusion
and the order relation on the real line. 

Let P be a non-empty set. A partial order relation in P is a relation
which is symbolized by ≤ and assumed to have the following properties:

(1) x ≤ x for every x (reflexivity);

(2) x ≤ y and y ≤ x ⇒ x = y (antisymmetry);

(3) x ≤ y and y ≤ z ⇒ x ≤ z (transitivity).

We sometimes write x ≤ y in the equivalent form y ≥ x. A non-empty
set P in which there is defined a partial order relation is called a partially
ordered set. It is clear that any non-empty subset of a partially ordered
set is a partially ordered set in its own right.

Partially ordered sets are abundant in all branches of mathematics. 
Some are simple and easy to grasp, while others are complex and rather 
inaccessible. We give four examples which are quite different in nature 
but possess in common the virtues of being both important and easily 
described. 


Example 1. Let P be the set of all positive integers, and let m ≤ n
mean that m divides n.


Example 2. Let P be the set R of all real numbers, and let x ≤ y have
its usual meaning (see Problem 1-2).


Example 3. Let P be the class of all subsets of some universal set U,
and let A ≤ B mean that A is a subset of B.


Example 4. Let P be the set of all real functions defined on a
non-empty set X, and let f ≤ g mean that f(x) ≤ g(x) for every x.


Two elements x and y in a partially ordered set are called comparable
if one of them is less than or equal to the other, that is, if either x ≤ y or
y ≤ x. The word “partially” in the phrase “partially ordered set” is
intended to emphasize that there may be pairs of elements in the set 
which are not comparable. In Example 1, for instance, the integers 
4 and 6 are not comparable, because neither divides the other; and in 
Example 3, if the universal set U has more than one element, it is always 
possible to find two subsets of U neither of which is a subset of the other. 
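
The three defining properties and the notion of comparability can be checked mechanically on a finite piece of Example 1. The Python sketch below restricts attention to the integers 1 through 30, which is an assumption made so that the check terminates; the helper names are hypothetical.

```python
from itertools import product

def divides(m, n):
    """The order relation of Example 1: m <= n means that m divides n."""
    return n % m == 0

def is_partial_order(P, leq):
    """Check reflexivity, antisymmetry, and transitivity on a finite set P."""
    reflexive = all(leq(x, x) for x in P)
    antisymmetric = all(x == y for x, y in product(P, P) if leq(x, y) and leq(y, x))
    transitive = all(leq(x, z) for x, y, z in product(P, P, P) if leq(x, y) and leq(y, z))
    return reflexive and antisymmetric and transitive

def comparable(x, y, leq):
    return leq(x, y) or leq(y, x)

P = range(1, 31)
print(is_partial_order(P, divides))   # True
print(comparable(4, 6, divides))      # False
print(comparable(4, 12, divides))     # True
```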

Some partial order relations possess a fourth property in addition to 
the three required by the definition: 

(4) any two elements are comparable. 

A partial order relation with property (4) is called a total (or linear) 
order relation, and a partially ordered set whose relation satisfies condition 
(4) is called a totally ordered set, or a linearly ordered set, or, most
frequently, a chain. Example 2 is a chain, as is the subset {2, 4, 8, . . . ,
2ⁿ, . . .} of Example 1.

Let P be a partially ordered set. An element x in P is said to be
maximal if y ≥ x ⇒ y = x, that is, if no element other than x itself is
greater than or equal to x. A maximal element in P is thus an element of
P which is not less than or equal to any other element of P. Examples
1, 2, and 4 have no maximal elements. Example 3 has a single maximal
element: the set U itself.

Let A be a non-empty subset of a partially ordered set P. An
element x in P is called a lower bound of A if x ≤ a for each a ∈ A; and a
lower bound of A is called a greatest lower bound of A if it is greater than or
equal to every lower bound of A. Similarly, an element y in P is said
to be an upper bound of A if a ≤ y for every a ∈ A; and a least upper
bound of A is an upper bound of A which is less than or equal to every
upper bound of A. In general, A may have many lower bounds and
many upper bounds, but it is easy to prove (see Problem 1) that a
greatest lower bound (or least upper bound) is unique if it exists. It is
therefore legitimate to speak of the greatest lower bound and the least
upper bound if they exist.

We illustrate these concepts in some of the partially ordered sets 
mentioned above. 

In Example 1, let the subset A consist of the integers 4 and 6. An 
upper bound of {4,6} is any positive integer divisible by both 4 and 6. 
12, 24, 36, and so on, are all upper bounds of {4,6}. 12 is clearly its 
least upper bound, for it is less than or equal to (i.e., it divides) every 
upper bound. The greatest lower bound of any pair of integers in this 
example is their greatest common divisor, and their least upper bound is 
their least common multiple—both of which are familiar notions from 
elementary arithmetic. 
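
The bounds of {4,6} under divisibility can be listed by brute force. The Python sketch below searches only the integers 1 through 60, an assumed finite window (the full set of upper bounds is infinite), and compares the result with the greatest common divisor and least common multiple computed arithmetically.

```python
from math import gcd

def divides(m, n):
    return n % m == 0

def upper_bounds(A, P):
    """Elements of P divisible by every element of A."""
    return [y for y in P if all(divides(a, y) for a in A)]

def lower_bounds(A, P):
    """Elements of P that divide every element of A."""
    return [x for x in P if all(divides(x, a) for a in A)]

P = range(1, 61)
A = [4, 6]
ub = upper_bounds(A, P)
lb = lower_bounds(A, P)
print(ub)   # [12, 24, 36, 48, 60]
print(lb)   # [1, 2]

# Least upper bound: the upper bound that divides every upper bound.
lub = next(y for y in ub if all(divides(y, u) for u in ub))
# Greatest lower bound: the lower bound divisible by every lower bound.
glb = next(x for x in lb if all(divides(l, x) for l in lb))
print(lub, glb)                        # 12 2
print(4 * 6 // gcd(4, 6), gcd(4, 6))   # 12 2
```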

We now consider Example 2, the real line with its natural order
relation. The reader will doubtless recall from his study of calculus that
3 is an upper bound of the set {(1 + 1/n)ⁿ : n = 1, 2, 3, . . .} and that
its least upper bound is the fundamental constant e = 2.7182 . . . .
As we have stated before, it is a basic property of the real line that every
non-empty subset of it which has a lower bound (or upper bound) has a
greatest lower bound (or least upper bound). There are several items of
standard notation and terminology which must be mentioned in
connection with this example. Let A be any non-empty set of real numbers.
If A has a lower bound, then its greatest lower bound is usually called
its infimum and denoted by inf A. Correspondingly, if A has an upper
bound, then its least upper bound is called its supremum and written
sup A. If A happens to be finite, then inf A and sup A both exist and
belong to A. In this case, they are often called the minimum and
maximum of A and are denoted by min A and max A. If A consists of
two real numbers a₁ and a₂, then min A is the smaller of a₁ and a₂, and
max A is the larger.
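
A numerical look at the set {(1 + 1/n)ⁿ} may help fix these ideas. The Python sketch below computes only the first 2000 terms in floating point, so it is evidence and not a proof; the cutoff 2000 is an arbitrary assumption.

```python
import math

terms = [(1 + 1 / n) ** n for n in range(1, 2001)]

print(terms[0], terms[9], terms[-1])   # the terms increase toward e
print(all(t < 3 for t in terms))       # 3 is an upper bound of the computed terms
print(max(terms) < math.e)             # every computed term lies below e

# For this finite subset, inf and sup are attained: they are min and max.
print(min(terms), max(terms))
```

The computed terms increase, so the maximum of the finite subset is its last term, while the supremum e of the full infinite set is not attained by any term.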

Finally, consider Example 3, and let A be any non-empty class of 
subsets of U. A lower bound of A is any subset of U which is contained 
in every set in A, and the greatest lower bound of A is the intersection 
of all its sets. Similarly, the least upper bound of A is the union of 
all its sets. 

One of our main aims in this section is to state Zorn’s lemma, an
exceedingly powerful tool of proof which is almost indispensable in
many parts of modern pure mathematics. Zorn’s lemma asserts that
if P is a partially ordered set in which every chain has an upper bound, then
P possesses a maximal element. It is not possible to prove this in the
usual sense of the word. However, it can be shown that Zorn’s lemma is
logically equivalent to the axiom of choice, which states: given any
non-empty class of disjoint non-empty sets, a set can be formed which
contains precisely one element taken from each set in the given class.
The axiom of choice may strike the reader as being intuitively obvious,
and in fact, either this axiom itself or some other principle equivalent to
it is usually postulated in the logic with which we operate. We therefore
assume Zorn’s lemma as an axiom of logic. Any reader who is interested
in these matters is urged to explore them further in the literature.¹

Fig. 16. The geometric meaning of f ∧ g and f ∨ g.

A lattice is a partially ordered set L in which each pair of elements has
a greatest lower bound and a least upper bound. If x and y are two
elements in L, we denote their greatest lower bound and least upper
bound by x ∧ y and x ∨ y. These notations are analogous to (and are
intended to suggest) the notations for the intersection and union of two
sets. We pursue this analogy even further, and call x ∧ y and x ∨ y the
meet and join of x and y. It is tempting to assume that all properties of
intersections and unions in the algebra of sets carry over to lattices, but
this is not a valid assumption. Some properties do carry over (see
Problem 5), but others, for instance the distributive laws, are false in
some lattices.

¹ See, for example, Wilder [42, pp. 129-132], Halmos [16, secs. 15-16], Birkhoff
[4, p. 42], Sierpinski [37, chap. 6], or Fraenkel and Bar-Hillel [10, p. 44].

It is easy to see that all four of our examples are lattices. In
Example 1, m ∧ n is the greatest common divisor of m and n, and m ∨ n is
their least common multiple; and in Example 3, A ∧ B = A ∩ B and
A ∨ B = A ∪ B. In Example 2, if x and y are any two real numbers,
then x ∧ y is min {x,y} and x ∨ y is max {x,y}. In Example 4, f ∧ g is
the real function defined on X by (f ∧ g)(x) = min {f(x),g(x)}, and
f ∨ g is that defined by (f ∨ g)(x) = max {f(x),g(x)}. Figure 16
illustrates the geometric meaning of f ∧ g and f ∨ g for two real functions
f and g defined on the closed unit interval [0,1].
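
The meet and join of two functions in Example 4 are computed pointwise, which the following Python sketch makes concrete. The sample functions f(x) = x and g(x) = 1 − x on [0,1] are assumptions chosen for illustration and do not come from Fig. 16.

```python
def meet(f, g):
    """Pointwise minimum: the greatest lower bound of f and g."""
    return lambda x: min(f(x), g(x))

def join(f, g):
    """Pointwise maximum: the least upper bound of f and g."""
    return lambda x: max(f(x), g(x))

f = lambda x: x          # hypothetical sample functions on [0, 1]
g = lambda x: 1 - x

f_meet_g = meet(f, g)
f_join_g = join(f, g)
for x in (0.0, 0.25, 0.5, 1.0):
    print(x, f_meet_g(x), f_join_g(x))
```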

Let L be a lattice. A sublattice of L is a non-empty subset L₁ of L
with the property that if x and y are in L₁, then x ∧ y and x ∨ y are also
in L₁. If L is the lattice of all real functions defined on the closed unit
interval, and if L₁ is the set of all continuous functions in L, then L₁ is
easily seen to be a sublattice of L.

If a lattice has the additional property that every non-empty subset 
has a greatest lower bound and a least upper bound, then it is called a 
complete lattice. Example 3 is the only complete lattice in our list. 

There are many distinct types of lattices, and the theory of these 
systems has a wide variety of interesting and significant applications 
(see Birkhoff [4]). We discuss some of these types in our Appendix on 
Boolean algebras. 


Problems 


1. Let A be a non-empty subset of a partially ordered set P. Show
that A has at most one greatest lower bound and at most one least
upper bound.

2. Consider the set {1, 2, 3, 4, 5}. What elements are maximal if it is
ordered as Example 1? If it is ordered as Example 2?

3. Under what circumstances is Example 4 a chain?

4. Give an example of a partially ordered set which is not a lattice.

5. Let L be a lattice. If x, y, and z are elements of L, verify the
following: x ∧ x = x, x ∨ x = x, x ∧ y = y ∧ x, x ∨ y = y ∨ x,
x ∧ (y ∧ z) = (x ∧ y) ∧ z, x ∨ (y ∨ z) = (x ∨ y) ∨ z,
(x ∧ y) ∨ x = x, (x ∨ y) ∧ x = x.

6. Let A be a class of subsets of some non-empty universal set U. We 
say that A has the finite intersection property if every finite subclass of 
A has non-empty intersection. Use Zorn’s lemma to prove that 
if A has the finite intersection property, then it is contained in some 
maximal class B with this property (to say that B is a maximal class 
with this property is to say that any class which properly contains B 
fails to have this property). (Hint: consider the family of all 
classes which contain A and have the finite intersection property, 
order this family by class inclusion, and show that any chain in the 
family has an upper bound in the family.) 

7. Prove that if X and Y are any two non-empty sets, then there exists
a one-to-one mapping of one into the other. (Hint: choose an
element x in X and an element y in Y, and establish the obvious
one-to-one correspondence between the two single-element sets
{x} and {y}; define an extension to be a pair of subsets A of X and
B of Y such that {x} ⊆ A and {y} ⊆ B, together with a one-to-one
correspondence between them under which x and y correspond with
one another; order the set of all extensions in the natural way; and
apply Zorn’s lemma.)

8. Let m and n be any two cardinal numbers (finite or infinite). The
statement that m is less than or equal to n (written m ≤ n) is defined
to mean the following: if X and Y are sets with m and n elements,
then there exists a one-to-one mapping of X into Y. Prove that
any non-empty set of cardinal numbers forms a chain when it is
ordered in this way. The fact that for any two cardinal numbers
one is less than or equal to the other is usually called the
comparability theorem for cardinal numbers.

9. Let X and Y be non-empty sets, and show that the cardinal number
of X is less than or equal to the cardinal number of Y ⇔ there exists
a mapping of Y onto X.

10. Let {Xᵢ} be any infinite class of countable sets indexed by the
elements i of an index set I, and show that the cardinal number of
∪ᵢ Xᵢ is less than or equal to the cardinal number of I. (Hint: if I
is only countably infinite, this follows from Problem 6-2, and if I is
uncountable, Zorn’s lemma can be applied to represent it as the
union of a disjoint class of countably infinite subsets.)


CHAPTER TWO 


Metric Spaces 


Classical analysis can be described as that part of mathematics 
which begins with calculus and, in essentially the same spirit, develops 
similar subject matter much further in many directions. It is a great 
nation in the world of mathematics, with many provinces, a few of which 
are ordinary and partial differential equations, infinite series (especially 
power series and Fourier series), and analytic functions of a complex 
variable. Each of these has experienced enormous growth over a long 
history, and each is rich enough in content to merit a lifetime of study. 

In the course of its development, classical analysis became so complex 
and varied that even an expert could find his way around in it only with 
difficulty. Under these circumstances, some mathematicians became 
interested in trying to uncover the fundamental principles on which all 
analysis rests. This movement had associated with it many of the great 
names in mathematics of the last century: Riemann, Weierstrass, Cantor, 
Lebesgue, Hilbert, Riesz, and others. It played a large part in the rise to 
prominence of topology, modern algebra, and the theory of measure and 
integration; and when these new ideas began to percolate back through 
classical analysis, the brew which resulted was modern analysis.

As modern analysis developed in the hands of its creators, many a 
major theorem was given a simpler proof in a more general setting, in an 
effort to lay bare its inner meaning. Much thought was devoted to 
analyzing the texture of the real and complex number systems, which are 
the context of analysis. It was hoped (and these hopes were well founded)
that analysis could be clarified and simplified, and that stripping away
superfluous underbrush would give new emphasis to what really mattered
from the point of view of the underlying theory.¹

Analysis is primarily concerned with limit processes and con- 
tinuity, so it is not surprising that mathematicians thinking along these 
lines soon found themselves studying (and generalizing) two elementary 
concepts: that of a convergent sequence of real or complex numbers, and 
that of a continuous function of a real or complex variable. 

We remind the reader of the definitions. First, a sequence

{xₙ} = {x₁, x₂, . . . , xₙ, . . .}

of real numbers is said to be convergent if there exists a real number x
(called the limit of the sequence) such that, given ε > 0, a positive
integer n₀ can be found with the property that

n ≥ n₀ ⇒ |xₙ − x| < ε.

This condition means that xₙ must be “close” to x for all “sufficiently
large” n, and it is usually symbolized by

xₙ → x or lim xₙ = x

and expressed by saying that xₙ approaches x or xₙ converges to x. Second,
a real function f defined on a non-empty subset X of the real line is
said to be continuous at x₀ in X if for each ε > 0 there exists δ > 0 such
that

x in X and |x − x₀| < δ ⇒ |f(x) − f(x₀)| < ε,

and f is said to be continuous if it is continuous at each point of X.
When X is an interval, this definition gives precise expression to the 
intuitive requirement that f have a graph without breaks or gaps. The 
corresponding definitions for sequences of complex numbers and complex 
functions of a complex variable are word for word the same. 
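
The definition of convergence can be made concrete for one simple sequence. The Python sketch below takes xₙ = 1/n with limit 0, an example assumed here for illustration, and produces for each ε an index n₀ that works; the check over finitely many n is a spot check, not a proof.

```python
import math

def n0_for(eps):
    """An index n0 such that |1/n - 0| < eps for all n >= n0 (sequence x_n = 1/n)."""
    return math.floor(1 / eps) + 1     # n0 > 1/eps, hence 1/n0 < eps

for eps in (0.5, 0.1, 0.003):
    n0 = n0_for(eps)
    ok = all(abs(1 / n - 0) < eps for n in range(n0, n0 + 10_000))
    print(eps, n0, ok)
```

Smaller values of ε force larger values of n₀, which is the precise meaning of "close for all sufficiently large n".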

Our purpose in giving these definitions in detail here is a simple one. 
We wish to point out explicitly that each is dependent for its meaning 
on the concept of the absolute value of the difference between two real or 
complex numbers. We wish to observe also that this absolute value is 
the distance between the numbers when they are regarded as points on the 
real line or in the complex plane. 

In many branches of mathematics (in geometry as well as analysis)
it has been found extremely convenient to have available a notion of
distance which is applicable to the elements of abstract sets. A metric
space (as we define it below) is nothing more than a non-empty set
equipped with a concept of distance which is suitable for the treatment
of convergent sequences in the set and continuous functions defined on
the set. Our purpose in this chapter is to develop in a systematic manner
the main elementary facts about metric spaces. These facts are
important for their own sake, and also for the sake of the motivation they
provide for our later work on topological spaces.

¹ We illustrate these points in Appendix 1, where one of the basic existence
theorems in the theory of differential equations is given a brief and uncluttered proof
which depends only on the ideas of this chapter.


9. THE DEFINITION AND SOME EXAMPLES 


Let X be a non-empty set. A metric on X is a real function d of
ordered pairs of elements of X which satisfies the following three
conditions:

(1) d(x,y) ≥ 0, and d(x,y) = 0 ⇔ x = y;

(2) d(x,y) = d(y,x) (symmetry);

(3) d(x,y) ≤ d(x,z) + d(z,y) (the triangle inequality).

The function d assigns to each pair (x,y) of elements of X a non-negative
real number d(x,y), which by symmetry does not depend on the order
of the elements; d(x,y) is called the distance between x and y. A metric
space consists of two objects: a non-empty set X and a metric d on X.
The elements of X are called the points of the metric space (X,d). When- 
ever it can be done without causing confusion, we denote the metric 
space (X,d) by the symbol X which is used for the underlying set of 
points. One should always keep in mind, however, that a metric space 
is not merely a non-empty set: it is a non-empty set together with a 
metric. It often happens that several different metrics can be defined 
on a single given non-empty set, and in this case distinct metrics make the 
set into distinct metric spaces. 

There are many different kinds of metric spaces, some of which 
play very significant roles in geometry and analysis. Our first example 
is rather trivial, but it is often useful in showing that certain statements 
we might wish to make are not true. It also shows that every non- 
empty set can be regarded as a metric space. 


Example 1. Let X be an arbitrary non-empty set, and define d by

d(x,y) = 0   if x = y,
d(x,y) = 1   if x ≠ y.


The reader can easily see for himself that this definition yields a metric 
on X. 
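
On a finite set the three conditions for a metric can be verified by checking every pair and every triple of points. The Python sketch below does this for the metric of Example 1 on a four-element set; the checker `is_metric`, the sample sets, and the squared-difference function used as a contrast are all assumptions introduced for illustration.

```python
from itertools import product

def discrete(x, y):
    """The metric of Example 1: 0 if the points coincide, 1 otherwise."""
    return 0 if x == y else 1

def is_metric(points, d):
    """Check conditions (1), (2), and (3) on a finite set of points."""
    pts = list(points)
    positive = all(d(x, y) >= 0 and ((d(x, y) == 0) == (x == y))
                   for x, y in product(pts, pts))
    symmetric = all(d(x, y) == d(y, x) for x, y in product(pts, pts))
    triangle = all(d(x, y) <= d(x, z) + d(z, y)
                   for x, y, z in product(pts, repeat=3))
    return positive and symmetric and triangle

X = ["a", "b", "c", "d"]
print(is_metric(X, discrete))                             # True
print(is_metric([0, 1, 3], lambda x, y: (x - y) ** 2))    # False
```

The squared difference fails the triangle inequality on the points 0, 1, 3, since 9 > 1 + 4, which shows that condition (3) is a genuine restriction.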


Our next two examples are the fundamental number systems of 
mathematics, 
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Example 2. Consider the real line & and the real function |z| defined 
on R. Three elementary properties of this absolute value function are 
important for our purposes: 


(i) |z| > 0, and |z| = Oz = 0; 
Gi) |-2| = |2l; 
(iii) |e + yl < |x| + lyl. 


We now define a metric on R by 
d(z,y) = |x — yl. 


This is called the usual metric on $R$, and the real line, as a metric space, is 
always understood to have this as its metric. The fact that $d$ actually 
is a metric follows from the three properties stated above. This is a 
piece of reasoning which occurs frequently in our work, so we give the 
details. By (i), $d(x,y) = |x - y|$ is a non-negative real number which 
equals $0 \Leftrightarrow x - y = 0 \Leftrightarrow x = y$. By (ii), 

$$ d(x,y) = |x - y| = |-(y - x)| = |y - x| = d(y,x). $$

And by (iii), 

$$ d(x,y) = |x - y| = |(x - z) + (z - y)| \leq |x - z| + |z - y| = d(x,z) + d(z,y). $$
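
The verification just given for Examples 1 and 2 can also be spot-checked by machine on a finite sample of points. The following is a minimal sketch, not a proof: the function names and the sample are our own illustrative choices, and a passing check only shows that no counterexample occurs among the sampled points.

```python
from itertools import product


def discrete_metric(x, y):
    """Metric of Example 1: 0 if the points coincide, 1 otherwise."""
    return 0 if x == y else 1


def usual_metric(x, y):
    """Usual metric on the real line (Example 2): d(x, y) = |x - y|."""
    return abs(x - y)


def satisfies_metric_axioms(d, points):
    """Check the metric axioms on every pair and triple of a finite sample."""
    for x, y in product(points, repeat=2):
        # d(x, y) >= 0, and d(x, y) = 0 exactly when x = y
        if d(x, y) < 0 or (d(x, y) == 0) != (x == y):
            return False
        # symmetry: d(x, y) = d(y, x)
        if d(x, y) != d(y, x):
            return False
    for x, y, z in product(points, repeat=3):
        # triangle inequality: d(x, y) <= d(x, z) + d(z, y)
        if d(x, y) > d(x, z) + d(z, y):
            return False
    return True


# Hypothetical sample; integers keep all arithmetic exact.
sample = [-3, -1, 0, 2, 5]
print(satisfies_metric_axioms(discrete_metric, sample))
print(satisfies_metric_axioms(usual_metric, sample))
```

A function such as $(x - y)^2$ fails the same check, because it violates the triangle inequality on this sample.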


Example 3. Consider the complex plane $C$. We mentioned $C$ briefly in 
Sec. 4, and we described the sense in which it can be identified as a set with 
the coordinate plane $R^2$. We now give a somewhat fuller discussion. If $z$ 
is a complex number, and if $z = a + ib$ where $a$ and $b$ are real numbers, 
then $a$ and $b$ are called the real part and the imaginary part of $z$ and are 
denoted by $R(z)$ and $I(z)$. Two complex numbers are said to be equal 
if their real and imaginary parts are equal: 

$$ a + ib = c + id \Leftrightarrow a = c \text{ and } b = d. $$


We add (or subtract) two complex numbers by adding (or subtracting) 
their real and imaginary parts, and we multiply them by multiplying 
them out as in elementary algebra and replacing $i^2$ by $-1$ wherever it 
appears: 

$$ (a + ib) + (c + id) = (a + c) + i(b + d), $$

and 

$$ (a + ib)(c + id) = ac + iad + ibc + i^2 bd = (ac - bd) + i(ad + bc). $$

Division is carried out in accordance with 

$$ \frac{a + ib}{c + id} = \frac{(a + ib)(c - id)}{(c + id)(c - id)} = \frac{ac + bd}{c^2 + d^2} + i\,\frac{bc - ad}{c^2 + d^2}, $$


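where $c^2 + d^2$ is required to be non-zero. If $z = a + ib$ is a complex 
number, then its negative $-z$ and its conjugate $\bar{z}$ are defined by 

$$ -z = (-a) + i(-b) \quad \text{and} \quad \bar{z} = a + i(-b), $$

which are usually written more informally as 
$-z = -a - ib$ and $\bar{z} = a - ib$. It is easy to see that 

$$ R(z) = \frac{z + \bar{z}}{2} \quad \text{and} \quad I(z) = \frac{z - \bar{z}}{2i}. $$

The real line $R$ is usually regarded as part of the complex plane: 

$$ R = \{z : I(z) = 0\} = \{z : \bar{z} = z\}. $$

Simple calculations show directly that 

$$ \overline{z_1 + z_2} = \bar{z}_1 + \bar{z}_2, \quad \overline{z_1 z_2} = \bar{z}_1 \bar{z}_2, \quad \text{and} \quad \bar{\bar{z}} = z. $$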


The origin, or zero, is the complex number $0 = 0 + i0$. The ordinary 
distance from $z = a + ib$ to the origin is defined by 

$$ |z| = (a^2 + b^2)^{1/2}. $$

$|z|$ is called the absolute value of $z$, and it is easy to see that 

$$ |\bar{z}| = |z| \quad \text{and} \quad |z|^2 = z\bar{z}. $$

The usual metric on $C$ is defined by 

$$ d(z_1,z_2) = |z_1 - z_2|. $$


Exactly as in Example 2, the fact that this is a metric is a consequence 
of the following properties of the real function $|z|$: 


(i) $|z| \geq 0$, and $|z| = 0 \Leftrightarrow z = 0$; 
(ii) $|-z| = |z|$; 
(iii) $|z_1 + z_2| \leq |z_1| + |z_2|$. 


Properties (i) and (ii) are obvious. Since $-z = (-1)z$, property (ii) 
is also a special case of the fact that 

$$ |z_1 z_2| = |z_1|\,|z_2|, $$

which we prove by means of 

$$ |z_1 z_2|^2 = z_1 z_2 \overline{z_1 z_2} = z_1 \bar{z}_1 z_2 \bar{z}_2 = |z_1|^2 |z_2|^2 = (|z_1|\,|z_2|)^2. $$

If we use the fact that $|R(z)| \leq |z|$ for any $z$, property (iii) follows directly 
from 

$$ \begin{aligned}
|z_1 + z_2|^2 &= (z_1 + z_2)\overline{(z_1 + z_2)} \\
&= (z_1 + z_2)(\bar{z}_1 + \bar{z}_2) \\
&= z_1\bar{z}_1 + z_2\bar{z}_2 + z_1\bar{z}_2 + \bar{z}_1 z_2 \\
&= |z_1|^2 + |z_2|^2 + (z_1\bar{z}_2 + \overline{z_1\bar{z}_2}) \\
&= |z_1|^2 + |z_2|^2 + 2R(z_1\bar{z}_2) \\
&\leq |z_1|^2 + |z_2|^2 + 2|z_1\bar{z}_2| \\
&= |z_1|^2 + |z_2|^2 + 2|z_1|\,|\bar{z}_2| \\
&= |z_1|^2 + |z_2|^2 + 2|z_1|\,|z_2| \\
&= (|z_1| + |z_2|)^2.
\end{aligned} $$


Whenever the complex plane C is mentioned as a metric space, its 
metric is always assumed to be the usual metric defined above. 
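
The properties of the absolute value used in Example 3 can be checked numerically with Python's built-in complex numbers. This is only an illustrative sketch: the helper names `modulus` and `check_modulus_properties`, the sample points, and the tolerance are assumptions of ours, and floating-point comparisons are necessarily approximate.

```python
import math


def modulus(z):
    """Absolute value |z| = (a^2 + b^2)^(1/2) of z = a + ib."""
    return math.hypot(z.real, z.imag)


def usual_metric(z1, z2):
    """Usual metric on the complex plane: d(z1, z2) = |z1 - z2|."""
    return modulus(z1 - z2)


def check_modulus_properties(samples, tol=1e-12):
    """Spot-check the identities and inequality (iii) on a finite sample."""
    for z1 in samples:
        # |conj(z)| = |z|
        if not math.isclose(modulus(z1.conjugate()), modulus(z1)):
            return False
        # |z|^2 = z * conj(z)
        if not math.isclose(modulus(z1) ** 2, (z1 * z1.conjugate()).real):
            return False
        for z2 in samples:
            # |z1 z2| = |z1| |z2|
            if not math.isclose(modulus(z1 * z2), modulus(z1) * modulus(z2)):
                return False
            # |z1 + z2| <= |z1| + |z2|, up to rounding error
            if modulus(z1 + z2) > modulus(z1) + modulus(z2) + tol:
                return False
    return True


# Hypothetical sample of non-zero complex numbers.
samples = [complex(3, 4), complex(-1, 2), complex(0, -5), complex(2.5, 0)]
print(check_modulus_properties(samples))
print(usual_metric(complex(1, 1), complex(4, 5)))
```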


The remaining examples to be given in this section fit a common 
pattern, which we have tried to exhibit in our discussion of Examples 2 
and 3. We now point out several major features of this pattern, so that 
the reader can see clearly how it applies in the slightly more complicated 
examples that follow. 

I. The elements of each space can be added and subtracted in a 
natural way, and every element has a negative. Each space 
contains a special element, denoted by 0 and called the origin, 
or zero element. 

II. In each space there is defined a notion of the distance from an 
arbitrary element to the origin, that is, a notion of the “size” 
of an arbitrary element. The size of an element $x$ is a real 
number denoted below by $\|x\|$ and called its norm. Our use 
of the double vertical bars is intended to emphasize that the 
norm is a generalization of the absolute value functions in 
Examples 2 and 3, in the sense that it satisfies the following 
three conditions: (i) $\|x\| \geq 0$, and $\|x\| = 0 \Leftrightarrow x = 0$; 
(ii) $\|-x\| = \|x\|$; (iii) $\|x + y\| \leq \|x\| + \|y\|$. 

III. Finally, each metric arises as the norm of the difference between 
two elements: $d(x,y) = \|x - y\|$. As in Example 2, the fact 
that this is a metric follows from the properties of the norm 
listed in II. This metric is called the metric induced by the 
norm. 

The knowledgeable reader will see at once that we are describing here 
(though incompletely and imprecisely) the concept of a normed linear 
space. Most of the metric spaces of major importance in analysis are 
of this type. 

Example 4. Let $f$ be a real function defined on the closed unit interval 
$[0,1]$. We say that $f$ is a bounded function if there is a real number $K$ 
such that $|f(x)| \leq K$ for every $x \in [0,1]$. This concept is familiar to 
the reader from elementary analysis, as is that of the continuity of $f$ as 
defined in the introduction to this chapter. The underlying set of points 
in this example is the set of all bounded continuous real functions defined 
on the closed unit interval. Actually, the boundedness of such a function 
is a consequence of its other properties, but at this stage we assume it 
explicitly. If $f$ and $g$ are two such functions, we add and subtract 
them, and form negatives, pointwise: 


$(f + g)(x) = f(x) + g(x)$; 
$(f - g)(x) = f(x) - g(x)$; 
$(-f)(x) = -f(x)$. 


The origin (denoted by 0) is the constant function which is identically 
zero: 


$$ 0(x) = 0 $$

for all $x \in [0,1]$. We define the norm of a function $f$ by 

$$ \|f\| = \int_0^1 |f(x)|\,dx, $$

and the induced metric by 

$$ d(f,g) = \|f - g\| = \int_0^1 |f(x) - g(x)|\,dx. $$

The integral involved in this definition is the Riemann integral of 
elementary calculus. Properties (i) and (ii) of the norm are easy to prove, 
and (iii) follows from 

$$ \begin{aligned}
\|f + g\| &= \int_0^1 |f(x) + g(x)|\,dx \leq \int_0^1 (|f(x)| + |g(x)|)\,dx \\
&= \int_0^1 |f(x)|\,dx + \int_0^1 |g(x)|\,dx \\
&= \|f\| + \|g\|.
\end{aligned} $$


Example 5. The set of points in the preceding example—that is, the 
set of all bounded continuous real functions defined on the closed unit 
interval—has another metric which is far more important for our pur- 
poses. It is defined by means of 


$$ \|f\| = \sup\,\{|f(x)| : x \in [0,1]\}, $$

which we usually write more briefly as 

$$ \|f\| = \sup |f(x)|, $$

and 

$$ d(f,g) = \|f - g\| = \sup |f(x) - g(x)|. $$

Properties (i) and (ii) of the norm are obvious, and in Problem 5 we ask 
the reader to prove (iii) in a slightly more general form. This example 
is typical of a large class of metric spaces which will play a major role 
in all our work throughout the rest of this book. We denote this space 
by $\mathcal{C}[0,1]$. 
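
Examples 4 and 5 put two different metrics on the same set of functions, and a small numerical experiment makes the difference concrete. The sketch below is an approximation under stated assumptions: the integral is replaced by a midpoint Riemann sum and the supremum by a maximum over a finite grid (which can only underestimate the true supremum). The grid size and the sample functions $f(x) = x$ and $g(x) = x^2$ are our own choices.

```python
def integral_norm(f, n=10_000):
    """Midpoint Riemann sum approximating the integral of |f| over [0, 1]."""
    h = 1.0 / n
    return sum(abs(f((k + 0.5) * h)) for k in range(n)) * h


def sup_norm(f, n=10_000):
    """Maximum of |f| over n + 1 equally spaced points of [0, 1].

    This approximates sup |f(x)| from below.
    """
    return max(abs(f(k / n)) for k in range(n + 1))


def distance(norm, f, g):
    """Metric induced by a norm: d(f, g) = ||f - g||."""
    return norm(lambda x: f(x) - g(x))


def f(x):
    return x


def g(x):
    return x * x


# The exact values are 1/6 for the integral metric and 1/4 for the sup metric.
print(distance(integral_norm, f, g))
print(distance(sup_norm, f, g))
```

The two distances differ, which illustrates the earlier remark that distinct metrics make the same underlying set into distinct metric spaces.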


So much for the present for specific examples. We now turn to 
several fundamental principles relating to metric spaces in general. 

Let X be a metric space with metric d. Let Y be an arbitrary non- 
empty subset of X. If the function d is considered to be defined only 
for points in Y, then (Y,d) is evidently itself a metric space. Y, with 
d restricted in this way, is called a subspace of X. This technique of 
forming subspaces of a given metric space enables us to obtain an infinity 
of further examples from the handful described above. For instance, 
the closed unit interval [0,1] is a subspace of the real line, as is the set 
consisting of all the rational points; and the unit circle, the closed unit 
disc, and the open unit disc are subspaces of the complex plane. Also, 
the real line itself is a subspace of the complex plane. 

It is desirable at this stage to introduce the extended real number 
system, by which we mean the ordinary real number system R with 
the symbols 

$$ -\infty \quad \text{and} \quad +\infty $$

adjoined. An extended real number is thus a real number or one of these 
symbols. We say (by definition) that 

$$ -\infty < +\infty; $$

also, if $x$ is any real number, then 

$$ -\infty < x < +\infty. $$

The symbols $-\infty$ and $+\infty$ add nothing to our understanding of the 
real numbers. They are used mainly as a notational convenience, as we 
see below. 

Let A be a non-empty set of real numbers which has an upper 
bound. In Sec. 8 we defined what is meant by the least upper bound 
(or supremum) of A: sup A is the smallest upper bound of A, that is, 
it is the smallest real number $y$ such that $a \leq y$ for every $a$ in $A$. With 
the stated assumptions about $A$, $\sup A$ always exists and is a real number. 
If $A$ is a non-empty set of real numbers which has no upper bound, and 
therefore no least upper bound in $R$, we express this by writing 

$$ \sup A = +\infty; $$

and if $A$ is the empty subset of $R$, we put 

$$ \sup A = -\infty. $$

The greatest lower bound (or infimum) of $A$ is defined similarly: if $A$ 
is non-empty and has a lower bound, $\inf A$ is the largest real number $x$ 
such that $x \leq a$ for every $a$ in $A$; if $A$ is non-empty and has no lower 
bound, we put 

$$ \inf A = -\infty; $$

and if $A$ is empty, we put 

$$ \inf A = +\infty. $$


These remarks illustrate one advantage of the extended real number 
system: it enables us to speak of sup A and inf A for subsets A of the 
real line without any restrictions whatever on the nature of A. 
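
For finite sets of real numbers these conventions are easy to mirror in code, since floating-point arithmetic already provides the two symbols. The sketch below covers only finite sets, for which the supremum of a non-empty set is simply its maximum; the function names `sup` and `inf` are our own, and an unbounded set cannot be represented this way.

```python
import math


def sup(A):
    """Least upper bound of a finite set of reals; -infinity for the empty set."""
    A = list(A)
    return max(A) if A else -math.inf


def inf(A):
    """Greatest lower bound of a finite set of reals; +infinity for the empty set."""
    A = list(A)
    return min(A) if A else math.inf


print(sup([1, 3, 2]), inf([1, 3, 2]))
print(sup([]), inf([]))
```

The empty set is thus the only set for which the infimum exceeds the supremum.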

Another advantage of having available the symbols $-\infty$ and $+\infty$ 
is that they make convenient a reasonable extension of our concept of 
an interval on the real line. The reader should refer to the definitions 
given in Sec. 1 of the various kinds of intervals, for these are the 
definitions whose scope we are now widening. Let $a$ and $b$ be any two real 
numbers such that $a \leq b$; then the closed interval from $a$ to $b$ is the 
subset of the real line $R$ defined by 

$$ [a,b] = \{x : a \leq x \leq b\}. $$

This extends our previous notion in that a closed interval may now 
consist of a single point (if $a = b$). If $b$ is a real number and $a$ is an 
extended real number such that $a < b$, then the open-closed interval 
from $a$ to $b$ is 

$$ (a,b] = \{x : a < x \leq b\}. $$

This allows open-closed intervals of the form $(-\infty,b]$. If $a$ is a real 
number and $b$ is an extended real number such that $a < b$, then the 
closed-open interval from $a$ to $b$ is 

$$ [a,b) = \{x : a \leq x < b\}. $$

This permits $[a,+\infty)$ to be considered a closed-open interval. If $a$ and $b$ 
are extended real numbers such that $a < b$, then the open interval from 
$a$ to $b$ is 

$$ (a,b) = \{x : a < x < b\}. $$

This adds to the previously defined open intervals those of the form 
$(-\infty,b)$ where $b$ is real, $(a,+\infty)$ where $a$ is real, and $(-\infty,+\infty)$. 
Throughout the rest of this book, the term interval will always signify 
one of the four types defined in this paragraph. The extended real 
numbers $a$ and $b$ are called the end-points of these intervals. We have 
used the symbols $-\infty$ and $+\infty$ with considerable freedom, and it 
therefore seems desirable to emphasize that an interval in our present sense 
is always a non-empty subset of the real number system: it never actually 
contains either of these symbols. 

The very definition of a metric space presents us with the concept of 
the distance from one point to another. We now define the distance 
from a point to a set and the diameter of a set. 

Let $X$ be a metric space with metric $d$, and let $A$ be a subset of $X$. 
If $x$ is a point of $X$, then the distance from $x$ to $A$ is defined by 

$$ d(x,A) = \inf\,\{d(x,a) : a \in A\}; $$

that is, it is the greatest lower bound of the distances from $x$ to the 
points of $A$. The diameter of the set $A$ is defined by 

$$ d(A) = \sup\,\{d(a_1,a_2) : a_1 \text{ and } a_2 \in A\}. $$

The diameter of $A$ is thus the least upper bound of the distances between 
pairs of its points. $A$ is said to have finite diameter or infinite diameter 
according as $d(A)$ is a real number or $\pm\infty$. We observe that the 
empty set has infinite diameter, since $d(\emptyset) = -\infty$. A bounded set is 
one whose diameter is finite. A mapping of a non-empty set into a 
metric space is called a bounded mapping if its range is a bounded set. 
Several of the simpler facts about these concepts are brought out in the 
following problems. 
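
The distance from a point to a set and the diameter of a set, as defined above, can be computed directly when the set is finite. The following is a minimal sketch under that assumption; the function names are hypothetical, and the `default` arguments reproduce the conventions for the empty set stated in the text.

```python
import math


def dist_to_set(d, x, A):
    """d(x, A) = inf of d(x, a) over a in A; +infinity when A is empty."""
    return min((d(x, a) for a in A), default=math.inf)


def diameter(d, A):
    """d(A) = sup of d(a1, a2) over pairs in A; -infinity when A is empty."""
    A = list(A)
    return max((d(a1, a2) for a1 in A for a2 in A), default=-math.inf)


def is_bounded(d, A):
    """A set is bounded when its diameter is a real number."""
    return math.isfinite(diameter(d, A))


def usual_metric(x, y):
    """Usual metric on the real line."""
    return abs(x - y)


A = [0, 1, 2]
print(dist_to_set(usual_metric, 5, A), diameter(usual_metric, A))
print(dist_to_set(usual_metric, 5, []), diameter(usual_metric, []))
```

With these conventions the empty set is not bounded, in agreement with the observation that it has infinite diameter.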


Problems 


1. Let $X$ be a metric space with metric $d$. Show that $d_1$, defined by 
$d_1(x,y) = d(x,y)/[1 + d(x,y)]$, is also a metric on $X$. Observe that 
$X$ itself is a bounded set in the metric space $(X,d_1)$. 

2. Let $X$ be a non-empty set, and let $d$ be a real function of ordered 
pairs of elements of $X$ which satisfies the following two conditions: 
$d(x,y) = 0 \Leftrightarrow x = y$, and $d(x,y) \leq d(x,z) + d(y,z)$. Show that $d$ 
is a metric on $X$. 

3. Let $X$ be a non-empty set, and let $d$ be a real function of ordered 
pairs of elements of $X$ which satisfies the following three conditions: 
$d(x,y) \geq 0$, and $x = y \Rightarrow d(x,y) = 0$; $d(x,y) = d(y,x)$; and $d(x,y) \leq 
d(x,z) + d(z,y)$. A function $d$ with these properties is called a 
pseudo-metric on $X$. A metric is obviously a pseudo-metric. Give 
an example of a pseudo-metric which is not a metric. Let $d$ be a 
pseudo-metric on $X$, define a relation $\sim$ in $X$ by means of 

$$ x \sim y \Leftrightarrow d(x,y) = 0, $$


and show that this is an equivalence relation whose corresponding 
class of equivalence sets can be made into a metric space in a natural 
way. 

4. Let $X_1, X_2, \ldots, X_n$ be a finite class of metric spaces with metrics 
$d_1, d_2, \ldots, d_n$. Show that each of the functions $d$ and $\tilde{d}$ defined 
as follows is a metric on the product $X_1 \times X_2 \times \cdots \times X_n$: 
$d(\{x_i\},\{y_i\}) = \max d_i(x_i,y_i)$; $\tilde{d}(\{x_i\},\{y_i\}) = \sum_{i=1}^{n} d_i(x_i,y_i)$. 

5. Let $X$ be a non-empty set and $f$ a real function defined on $X$. Show 
that $f$ is bounded in the sense of the definition given in the last 
paragraph of the text $\Leftrightarrow$ there exists a real number $K$ such that 
$|f(x)| \leq K$ for every $x \in X$ $\Leftrightarrow$ $\sup |f(x)| < +\infty$. Consider the set 
of all bounded real functions defined on $X$, and define the norm of a 
function $f$ in this set by 

$$ \|f\| = \sup |f(x)|. $$

It is obvious that $\|f\|$ is a non-negative real number such that 
$\|f\| = 0 \Leftrightarrow f = 0$, and that $\|-f\| = \|f\|$. Prove in detail that 
$\|f + g\| \leq \|f\| + \|g\|$. 

6. Let $I$ be a subset of the real line. Show that $I$ is an interval $\Leftrightarrow$ it is 
non-empty and contains each point between any two of its points 
(in the sense that if $x$ and $z$ are in $I$ and $x \leq y \leq z$, then $y$ is in $I$). 
If $\{I_i\}$ is a non-empty class of intervals on the real line such that 
$\bigcap_i I_i$ is non-empty, show that $\bigcup_i I_i$ is an interval. 

7. Let $X$ be a metric space with metric $d$. If $x$ is a point of $X$ and 
$A$ a subset of $X$, show the following: if $A$ is non-empty, $d(x,A)$ is a 
non-negative real number; and $d(x,A) = +\infty \Leftrightarrow A$ is empty. 

8. Let $X$ be a metric space with metric $d$ and $A$ a subset of $X$. Show 
the following: if $A$ is non-empty, $d(A)$ is a non-negative extended 
real number; $d(A) = -\infty \Leftrightarrow A$ is empty; and if $A$ is bounded, it is 
non-empty. 


10. OPEN SETS 


Let $X$ be a metric space with metric $d$. If $x_0$ is a point of $X$ and $r$ 
is a positive real number, the open sphere $S_r(x_0)$ with center $x_0$ and radius $r$ 
is the subset of $X$ defined by 

$$ S_r(x_0) = \{x : d(x,x_0) < r\}. $$

An open sphere is always non-empty, for it contains its center. In 
Example 9-1, an open sphere with radius 1 contains only its center. 
$S_r(x_0)$ is often called the open sphere with radius $r$ centered on $x_0$; 
intuitively, it consists of all points in $X$ which are “close” to $x_0$, with the degree 
of closeness given by $r$. 

A few concrete examples are in order. It should be easy to visualize 
the open sphere $S_r(x_0)$ on the real line: it is the bounded open interval 
$(x_0 - r, x_0 + r)$ with mid-point $x_0$ and total length $2r$. Conversely, it is 
clear that any bounded open interval on the real line is an open sphere, 
so the open spheres on the real line are precisely the bounded open 
intervals. The open sphere $S_r(z_0)$ in the complex plane (see Fig. 17) 
is the inside of the circle with center $z_0$ and radius $r$. Figure 18 
illustrates an open sphere in the space $\mathcal{C}[0,1]$: $S_r(f_0)$ consists of all functions $f$ 
in $\mathcal{C}[0,1]$ whose graphs lie within the shaded band of vertical width $2r$ 
centered on the graph of $f_0$. 

Fig. 17. An open sphere in the complex plane, $|z - z_0| < r$. 
Fig. 18. An open sphere in $\mathcal{C}[0,1]$. 

A subset $G$ of the metric space $X$ is called an open set if, given any 
point $x$ in $G$, there exists a positive real number $r$ such that $S_r(x) \subseteq G$, 
that is, if each point of $G$ is the center of some open sphere contained in 
$G$. Loosely speaking, a set is open if each of its points is “inside” the 
set, in the sense made precise by the definition. On the real line, a set 
consisting of a single point is not open, for each bounded open interval 
centered on the point contains points not in the set. Similarly, the 
subset $[0,1)$ of the real line is not open, because the point 0 in $[0,1)$ has 
the property that each bounded open interval centered on it (no matter 
how small it may be) contains points not in $[0,1)$, e.g., negative points. 
If we omit the offending point 0, the resulting bounded open interval 
$(0,1)$ is an open set (this is very easy to prove and is a special case of 
Theorem B below). Further, it is quite clear that any open interval, 
bounded or not, is an open set, and also that the open intervals are 
the only intervals which are open sets. 


Theorem A. In any metric space $X$, the empty set $\emptyset$ and the full space $X$ 
are open sets. 

PROOF. To show that $\emptyset$ is open, we must show that each point in $\emptyset$ is 
the center of an open sphere contained in $\emptyset$; but since there are no points 
in $\emptyset$, this requirement is automatically satisfied. $X$ is clearly open, since 
every open sphere centered on each of its points is contained in $X$. 

We have seen that $[0,1)$ is not open as a subset of the real line. 
However, if we consider [0,1) as a metric space X in its own right, as a 
subspace of the real line, then [0,1) is open as a subset of X, since from 
this point of view it is the full space. This apparent paradox disappears 
when we realize that points outside of a given metric space have no 
relevance to any discussion taking place within the context of that space. 
A set is open or not open only with respect to a specific metric space con- 
taining it, never on its own. 

Our next theorem justifies the adjective in the expression “open 
sphere.” 


Theorem B. In any metric space X, each open sphere is an open set. 
PROOF. Let $S_r(x_0)$ be an open sphere in $X$, and let $x$ be a point in $S_r(x_0)$. 
We must produce an open sphere centered on $x$ and contained in $S_r(x_0)$. 
Since $d(x,x_0) < r$, $r_1 = r - d(x,x_0)$ is a positive real number. We show 
that $S_{r_1}(x) \subseteq S_r(x_0)$. If $y$ is a point in $S_{r_1}(x)$, so that $d(y,x) < r_1$, then 

$$ d(y,x_0) \leq d(y,x) + d(x,x_0) < r_1 + d(x,x_0) = [r - d(x,x_0)] + d(x,x_0) = r $$

shows that $y$ is in $S_r(x_0)$. 
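
The choice of radius $r_1 = r - d(x,x_0)$ in this proof can be tried out on the real line with exact rational arithmetic. The sketch below is only an illustration on a finite sample of points; the particular center, radius, and point $x$ are hypothetical values chosen by us, and the names are our own.

```python
from fractions import Fraction


def d(x, y):
    """Usual metric on the real line."""
    return abs(x - y)


def in_open_sphere(y, center, radius):
    """True when y lies in the open sphere S_radius(center)."""
    return d(y, center) < radius


def sample_sphere(center, radius, n=200):
    """Equally spaced points of S_radius(center), approaching both ends."""
    return [center + radius * Fraction(k, n) for k in range(-n + 1, n)]


x0, r = Fraction(0), Fraction(1)   # the sphere S_r(x0) = (-1, 1)
x = Fraction(3, 5)                 # a point of S_r(x0)
r1 = r - d(x, x0)                  # the radius used in the proof, here 2/5

# Every sampled point of S_{r1}(x) lies in S_r(x0).
contained = all(in_open_sphere(y, x0, r) for y in sample_sphere(x, r1))

# A slightly larger radius about x already escapes from S_r(x0).
too_big = r1 + Fraction(1, 10)
escapes = any(not in_open_sphere(y, x0, r) for y in sample_sphere(x, too_big))

print(r1, contained, escapes)
```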


The following characterization of open sets in terms of open spheres 
is a useful tool. 


Theorem C. Let $X$ be a metric space. A subset $G$ of $X$ is open $\Leftrightarrow$ it is a 
union of open spheres. 

PROOF. We assume first that $G$ is open, and we show that it is a union 
of open spheres. If $G$ is empty, it is the union of the empty class of 
open spheres. If $G$ is non-empty, then since it is open, each of its 
points is the center of an open sphere contained in it, and it is the union 
of all the open spheres contained in it. 

We now assume that $G$ is the union of a class $\mathbf{S}$ of open spheres. We 
must show that $G$ is open. If $\mathbf{S}$ is empty, then $G$ is also empty, and by 
Theorem A, $G$ is open. Suppose that $\mathbf{S}$ is non-empty. $G$ is also 
non-empty. Let $x$ be a point in $G$. Since $G$ is the union of the open spheres 
in $\mathbf{S}$, $x$ belongs to an open sphere $S_r(x_0)$ in $\mathbf{S}$. By Theorem B, $x$ is the 
center of an open sphere $S_{r_1}(x) \subseteq S_r(x_0)$. Since $S_r(x_0) \subseteq G$, $S_{r_1}(x) \subseteq G$ and 
we have an open sphere centered on $x$ and contained in $G$. $G$ is 
therefore open. 


The fundamental properties of the open sets in a metric space are 
those stated in 


Theorem D. Let X be a metric space. Then (1) any union of open sets 
in X is open; and (2) any finite intersection of open sets in X is open. 

PROOF. To prove (1), let $\{G_i\}$ be an arbitrary class of open sets in $X$. 
We must show that $G = \bigcup_i G_i$ is open. If $\{G_i\}$ is empty, then $G$ is 
empty, and by Theorem A, $G$ is open. Suppose that $\{G_i\}$ is non-empty. 
By Theorem C, each $G_i$ (being an open set) is a union of open spheres; $G$ 
is the union of all the open spheres which arise in this way; and by another 
application of Theorem C, $G$ is open. 

To prove (2), let $\{G_i\}$ be a finite class of open sets in $X$. We must 
show that $G = \bigcap_i G_i$ is open. If $\{G_i\}$ is empty, then $G = X$; and by 
Theorem A, $G$ is open. Suppose that $\{G_i\}$ is non-empty and that 
$\{G_i\} = \{G_1, G_2, \ldots, G_n\}$ for some positive integer $n$. If $G$ happens 
to be empty, then it is open by Theorem A, so we may assume that $G$ is 
non-empty. Let $x$ be a point in $G$. Since $x$ is in each $G_i$, and each $G_i$ is 
open, for each $i$ there is a positive real number $r_i$ such that $S_{r_i}(x) \subseteq G_i$. 
Let $r$ be the smallest number in the set $\{r_1, r_2, \ldots, r_n\}$. This number 
$r$ is a positive real number such that $S_r(x) \subseteq S_{r_i}(x)$ for each $i$, so $S_r(x) \subseteq 
G_i$ for each $i$, and therefore $S_r(x) \subseteq G$. Since $S_r(x)$ is an open sphere 
centered on $x$ and contained in $G$, $G$ is open. 


The above theorem says that the class of all open sets in a metric 
space is closed under the formation of arbitrary unions and finite inter- 
sections. The reader should clearly understand that Theorem A is an 
immediate consequence of this statement, since the empty set is the 
union of the empty class of open sets and the full space is its intersection. 
The limitation to finite intersections in this theorem is essential. To see 
this, it suffices to consider the following sequence of open intervals on the 
real line: 


$$ (-1,1), \quad (-\tfrac{1}{2},\tfrac{1}{2}), \quad (-\tfrac{1}{3},\tfrac{1}{3}), \quad \ldots $$


The intersection of these open sets is the set {0} consisting of the single 
point 0, and this set is not open. 
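
The reason the intersection shrinks to the single point 0 is that every non-zero $x$ is eventually excluded: $x$ fails to lie in $(-1/n, 1/n)$ as soon as $n \geq 1/|x|$. The following sketch finds the first such $n$ by direct search, using exact fractions; the function names are hypothetical and the search is meant only as an illustration.

```python
from fractions import Fraction


def in_nth_interval(x, n):
    """True when x lies in the open interval (-1/n, 1/n)."""
    return Fraction(-1, n) < x < Fraction(1, n)


def first_excluding_index(x):
    """Smallest n for which x is not in (-1/n, 1/n); requires x != 0."""
    if x == 0:
        raise ValueError("0 lies in every interval (-1/n, 1/n)")
    n = 1
    while in_nth_interval(x, n):
        n += 1
    return n


print(first_excluding_index(Fraction(1, 1000)))
print(first_excluding_index(Fraction(-3, 7)))
print(all(in_nth_interval(Fraction(0), n) for n in range(1, 1001)))
```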

In an arbitrary metric space, the structure of the open sets can be 
very complicated indeed. Theorem C contains the best information 
available in the general case: each open set is a union of open spheres. 
In the case of the real line, however, a description can be given of the 
open sets which is fairly explicit and reasonably satisfying to the intuition. 


Theorem E. Every non-empty open set on the real line is the union of a 
countable disjoint class of open intervals. 

PROOF. Let $G$ be a non-empty open subset of the real line. Let $x$ be a 
point of $G$. Since $G$ is open, $x$ is the center of a bounded open interval 
contained in $G$. Define $I_x$ to be the union of all the open intervals 
which contain $x$ and are contained in $G$. The following three facts are 
easily proved: $I_x$ is an open interval (by Theorem D and Problem 9-6) 
which contains $x$ and is contained in $G$; $I_x$ contains every open interval 
which contains $x$ and is contained in $G$; and if $y$ is another point in $I_x$, 
then $I_x = I_y$. We next observe that if $x$ and $y$ are any two distinct 
points of $G$, then $I_x$ and $I_y$ are either disjoint or identical; for if they 
have a common point $z$, then $I_x = I_z$ and $I_y = I_z$, so $I_x = I_y$. Consider 
the class $\mathbf{I}$ of all distinct sets of the form $I_x$ for points $x$ in $G$. This is a 
disjoint class of open intervals, and $G$ is obviously its union. It remains 
to be proved that $\mathbf{I}$ is countable. Let $G_r$ be the set of rational points in 
$G$. $G_r$ is clearly non-empty. We define a mapping $f$ of $G_r$ onto $\mathbf{I}$ as 
follows: for each $r$ in $G_r$, let $f(r)$ be that unique interval in $\mathbf{I}$ which 
contains $r$. $G_r$ is countable by Problem 6-7, and the fact that $\mathbf{I}$ is countable 
follows from Problem 6-8. 
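
When an open set is given as a finite union of open intervals, the disjoint intervals promised by Theorem E can be computed explicitly. The sketch below uses the standard sort-and-merge technique rather than the construction in the proof, and it handles only finitely many intervals; the function name and the sample data are our own. Note that $(0,3)$ and $(3,4)$ are kept apart, since the point 3 belongs to neither.

```python
import math


def components(intervals):
    """Disjoint open intervals whose union equals the union of the input.

    Each interval is a pair (a, b) standing for the open interval (a, b);
    pairs with a >= b are empty and are ignored.
    """
    merged = []
    for a, b in sorted(iv for iv in intervals if iv[0] < iv[1]):
        if merged and a < merged[-1][1]:
            # (a, b) overlaps the last component, so the two fuse.
            merged[-1] = (merged[-1][0], max(merged[-1][1], b))
        else:
            # No overlap: a new component starts here.
            merged.append((a, b))
    return merged


G = [(0, 2), (1, 3), (3, 4), (5, math.inf), (-1, -0.5)]
print(components(G))
```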


A firm grasp of the ideas involved in the theory of metric spaces 
depends on one's capacity to “see” these spaces with the mind's eye. 
The complex plane is perhaps the best metric space to use as a model 
from which to absorb this necessary intuitive understanding. When we 
consider an unspecified set $A$ of complex numbers, we usually imagine 
it as a region bounded by a curve, as in Fig. 19. We think of the point 
$z_1$, which is completely surrounded by points of $A$, as being “inside” 
the set $A$, or in its “interior,” while $z_2$ is on the “boundary” of $A$. More 
precisely, $z_1$ is the center of some open sphere contained in $A$, and 
each open sphere centered on $z_2$ intersects both $A$ and its complement $A'$. 
We formulate these ideas for a general metric space in the next paragraph 
and at the end of the next section. 

Fig. 19. A set $A$ of complex numbers with interior point $z_1$ and boundary 
point $z_2$. 

Let X be an arbitrary metric space, and let A be a subset of X. A 
point in A is called an interior point of A if it is the center of some open 
sphere contained in A; and the interior of A, denoted by Int(A), is the set 
of all its interior points. Symbolically, 


$$ \mathrm{Int}(A) = \{x : x \in A \text{ and } S_r(x) \subseteq A \text{ for some } r\}. $$


The basic properties of interiors are the following: 
(1) Int(A) is an open subset of $A$ which contains every open subset 
of $A$ (this is often expressed by saying that the interior of $A$ is 
the largest open subset of $A$); 
(2) $A$ is open $\Leftrightarrow A = \mathrm{Int}(A)$; 
(3) Int(A) equals the union of all open subsets of $A$. 

The proofs of these facts are quite easy, and we ask the reader to fill in 
the details as an exercise (see Problem 8). 


Problems 


1. Let $X$ be a metric space, and show that any two distinct points of $X$ 
can be separated by open spheres in the following sense: if $x$ and $y$ 
are distinct points in $X$, then there exists a disjoint pair of open 
spheres each of which is centered on one of the points. 

2. Let $X$ be a metric space. If $\{x\}$ is a subset of $X$ consisting of a 
single point, show that its complement $\{x\}'$ is open. More 
generally, show that $A'$ is open if $A$ is any finite subset of $X$. 

3. Let $X$ be a metric space and $S_r(x)$ the open sphere in $X$ with center 
$x$ and radius $r$. Let $A$ be a subset of $X$ with diameter less than $r$ 
which intersects $S_r(x)$. Prove that $A \subseteq S_{2r}(x)$. 

4. Let $X$ be a metric space. Show that every subset of $X$ is open $\Leftrightarrow$ 
each subset of $X$ which consists of a single point is open. 

5. Let $X$ be a metric space with metric $d$, and let $d_1$ be the metric 
defined in Problem 9-1. Show that the two metric spaces $(X,d)$ 
and $(X,d_1)$ have precisely the same open sets. (Hint: show that 
they have the same open spheres with one exception. What is this 
exception?) 

6. If $X = X_1 \times X_2 \times \cdots \times X_n$ is the product in Problem 9-4, and 
if $d$ and $\tilde{d}$ are the metrics on $X$ defined in that problem, show that 
the two metric spaces $(X,d)$ and $(X,\tilde{d})$ have precisely the same 
open sets. Observe that in this case the spaces do not have the 
same open spheres. 

7. Let $Y$ be a subspace of a metric space $X$, and let $A$ be a subset of the 
metric space $Y$. Show that $A$ is open as a subset of $Y$ $\Leftrightarrow$ it is the 
intersection with $Y$ of a set which is open in $X$. 

8. Prove the statements made in the text about interiors. 

9. Describe the interior of each of the following subsets of the real 
line: the set of all integers; the set of all rationals; the set of all 
irrationals; $(0,1)$; $[0,1]$; $[0,1) \cup \{1,2\}$. Do the same for each of the 
following subsets of the complex plane: $\{z : |z| < 1\}$; $\{z : |z| \leq 1\}$; 
$\{z : I(z) = 0\}$; $\{z : R(z) \text{ is rational}\}$. 

10. Let $A$ and $B$ be two subsets of a metric space $X$, and prove the 
following: 
(a) $\mathrm{Int}(A) \cup \mathrm{Int}(B) \subseteq \mathrm{Int}(A \cup B)$; 
(b) $\mathrm{Int}(A) \cap \mathrm{Int}(B) = \mathrm{Int}(A \cap B)$. 
Give an example of two subsets $A$ and $B$ of the real line such that 
$\mathrm{Int}(A) \cup \mathrm{Int}(B) \neq \mathrm{Int}(A \cup B)$. 


11. CLOSED SETS 


Let $X$ be a metric space with metric $d$. If $A$ is a subset of $X$, a 
point $x$ in $X$ is called a limit point of $A$ if each open sphere centered on $x$ 
contains at least one point of $A$ different from $x$. The essential idea here 
is that the points of $A$ different from $x$ get “arbitrarily close” to $x$, or 
“pile up” at $x$. 

The subset $\{1, \tfrac{1}{2}, \tfrac{1}{3}, \ldots\}$ of the real line has 0 as a limit point; 
in fact, 0 is its only limit point. The closed-open interval $[0,1)$ has 0 as a 
limit point which is in the set and 1 as a limit point which is not in the 
set; further, every real number $x$ such that $0 < x < 1$ is also a limit 
point of this set. The set of all integral points on the real line has no 
limit points at all, whereas every real number is a limit point of the set of 
all rationals. In Example 9-1, every open sphere of radius less than 1 
consists only of its center, so no subset of this space has any limit points. 

A subset F of the metric space X is called a closed set if it contains 
each of its limit points. In rough terms, a set is closed if its points do not 
get arbitrarily close to any point outside of it. Among the subsets of the 
real line mentioned in the preceding paragraph, only the set of integral 
points is closed. In Example 9-1, every subset is closed. 


Theorem A. In any metric space $X$, the empty set $\emptyset$ and the full space $X$ 
are closed sets. 

PROOF. The empty set has no limit points, so it contains them all and is 
therefore closed. Since the full space X contains all points, it auto- 
matically contains its own limit points and thus is closed. 


The following theorem characterizes closed sets in terms of open sets. 
We already know a good deal about open sets, so this characterization 
provides us with a useful tool for establishing properties of closed sets. 


Theorem B. Let $X$ be a metric space. A subset $F$ of $X$ is closed $\Leftrightarrow$ its 
complement $F'$ is open. 

PROOF. Assume first that $F$ is closed. We show that $F'$ is open. If 
$F'$ is empty, it is open by Theorem 10-A, so we may suppose that $F'$ is 
non-empty. Let $x$ be a point in $F'$. Since $F$ is closed and $x$ is not in $F$, $x$ 
is not a limit point of $F$. Since $x$ is not in $F$ and is not a limit point of $F$, 
there exists an open sphere $S_r(x)$ which is disjoint from $F$. $S_r(x)$ is an 
open sphere centered on $x$ and contained in $F'$, and since $x$ was taken to 
be any point of $F'$, $F'$ is open. 

We now assume that $F'$ is open and show that $F$ is closed. The only 
way $F$ can fail to be closed is to have a limit point in $F'$. This cannot 
happen, for since $F'$ is open, each of its points is the center of an open 
sphere disjoint from $F$, and no such point can be a limit point of $F$. 


If $x_0$ is a point in our metric space $X$, and $r$ is a non-negative real 
number, the closed sphere $S_r[x_0]$ with center $x_0$ and radius $r$ is the subset 
of $X$ defined by 

$$ S_r[x_0] = \{x : d(x,x_0) \leq r\}. $$

$S_r[x_0]$ contains its center, and when $r = 0$ it contains only its center. 
The closed spheres on the real line are precisely the closed intervals. In 
this connection, we observe that though open spheres on the real line are 
open intervals, there are open intervals which are not open spheres, e.g., 
$(-\infty,+\infty)$. 

The following theorem justifies the adjective in the phrase “closed 
sphere.” 


Theorem C. In any metric space $X$, each closed sphere is a closed set. 
PROOF. Let $S_r[x_0]$ be a closed sphere in $X$. By Theorem B, it suffices 
to show that its complement $S_r[x_0]'$ is open. $S_r[x_0]'$ is open if it is empty, 
so we may assume that it is non-empty. Let $x$ be a point in $S_r[x_0]'$. 
Since $d(x,x_0) > r$, $r_1 = d(x,x_0) - r$ is a positive real number. We take 
$r_1$ as the radius of an open sphere $S_{r_1}(x)$ centered on $x$, and we show that 
$S_r[x_0]'$ is open by showing that $S_{r_1}(x) \subseteq S_r[x_0]'$. Let $y$ be a point in 
$S_{r_1}(x)$, so that $d(y,x) < r_1$. On the basis of this and the fact that 
$d(x_0,x) \leq d(x_0,y) + d(y,x)$, we see that 

$$ d(y,x_0) \geq d(x,x_0) - d(y,x) > d(x,x_0) - r_1 = d(x,x_0) - [d(x,x_0) - r] = r, $$

so that $y$ is in $S_r[x_0]'$. 


The main general facts about closed sets are those given in our next 
theorem. 


Theorem D. Let $X$ be a metric space. Then (1) any intersection of closed 
sets in $X$ is closed; and (2) any finite union of closed sets in $X$ is closed. 
PROOF. By virtue of Eqs. 2-(2) and Theorem B above, this theorem is 
an immediate consequence of Theorem 10-D. We prove (1) as follows. 
If $\{F_i\}$ is an arbitrary class of closed subsets of $X$ and $F = \bigcap_i F_i$, then 
by Theorem B, $F$ is closed if $F'$ is open; but $F' = \bigcup_i F_i'$ is open by Theorem 
10-D, since by Theorem B each $F_i'$ is open. The second statement is 
proved similarly. 


In Theorem E of the previous section, we gave an explicit charac- 
terization of the open sets on the real line. We now consider the struc- 
ture of its closed sets. Among the simplest closed sets on the real line 
are the closed intervals (which are the closed spheres) and finite unions
of closed intervals. Finite sets are included among these, since a set
consisting of a single point is a closed interval with equal end-points.
What is the character of the most general closed set on the real line?
Since closed sets are the complements of open sets, Theorem 10-E gives
a complete answer to this question: the most general proper closed subset
of the real line is obtained by removing a countable disjoint class of open
intervals. This process sounds innocent enough, but in fact it leads to
some rather curious and complicated examples. One of these examples
is of particular importance. It was studied by Cantor and is usually
called the Cantor set.

To construct the Cantor set, we proceed as follows (see Fig. 20).
First, denote the closed unit interval [0,1] by F_1. Next, delete from F_1
the open interval (1/3,2/3) which is its middle third, and denote the
remaining closed set by F_2. Clearly,


F_2 = [0,1/3] ∪ [2/3,1].


Next, delete from F_2 the open intervals (1/9,2/9) and (7/9,8/9), which are
the middle thirds of its two pieces, and denote the remaining closed set
by F_3. It is easy to see that


F_3 = [0,1/9] ∪ [2/9,1/3] ∪ [2/3,7/9] ∪ [8/9,1].


Fig. 20. The Cantor set.


If we continue this process, at each stage deleting the open middle third
of each closed interval remaining from the previous stage, we obtain a
sequence of closed sets F_n, each of which contains all its successors. The
Cantor set F is defined by


F = ∩_{n=1}^∞ F_n,


and it is closed by Theorem D. F consists of those points in the closed
unit interval [0,1] which "ultimately remain" after the removal of all the
open intervals (1/3,2/3), (1/9,2/9), (7/9,8/9), . . . . What points do remain?
F clearly contains the end-points of the closed intervals which make up
each set F_n:


0, 1, 1/3, 2/3, 1/9, 2/9, 7/9, 8/9, . . . .
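
The first few stages of this construction are easy to reproduce mechanically. The following Python sketch is offered only as an illustration; the function name cantor_stage is our own and does not appear in the text. It uses exact rational arithmetic to list the closed intervals which make up F_n, each given by its pair of end-points.

```python
from fractions import Fraction

def cantor_stage(n):
    """Return the closed intervals making up F_n as (left, right) pairs."""
    intervals = [(Fraction(0), Fraction(1))]  # F_1 = [0,1]
    for _ in range(n - 1):
        next_intervals = []
        for a, b in intervals:
            third = (b - a) / 3
            # Delete the open middle third (a + third, b - third).
            next_intervals.append((a, a + third))
            next_intervals.append((b - third, b))
        intervals = next_intervals
    return intervals

F3 = cantor_stage(3)
print([(str(a), str(b)) for a, b in F3])
# [('0', '1/9'), ('2/9', '1/3'), ('2/3', '7/9'), ('8/9', '1')]
```

The end-points printed for F_3 are exactly the end-points listed above, and each further stage doubles the number of intervals.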


Does F contain any other points? We leave it to the reader to verify
that 1/4 is in F and is not an end-point. Actually, F contains a multitude
of points other than the above end-points, for the set of these end-points
is clearly countable, while the cardinal number of F itself is c, the cardinal
number of the continuum. To prove this, it suffices to exhibit a one-to-
one mapping f of [0,1) into F. We construct such a mapping as follows.
Let x be a point in [0,1), and let x = .b_1b_2b_3 . . . be its binary expansion
(see Sec. 7). Each b_n is either 0 or 1. Let t_n = 2b_n, and regard
.t_1t_2t_3 . . . as the ternary expansion of a real number f(x) in [0,1). The
reader will easily convince himself that f(x) is in the Cantor set F: since
t_1 is 0 or 2, f(x) is not in [1/3,2/3); since t_2 is 0 or 2, f(x) is not in
[1/9,2/9) or [7/9,8/9); etc.
Also, it is easy to see that the mapping f: [0,1) → F is one-to-one.
According to this, F contains exactly as many points as the entire closed
unit interval [0,1]. It is interesting to compare this conclusion with the
fact that the sum of the lengths of all the open intervals removed is
precisely 1, since


1/3 + 2/9 + 4/27 + · · · = 1.


It is also interesting to observe (by doing a little arithmetic) that F_25 is
the union of 16,777,216 disjoint closed intervals of the same length which
are rather irregularly distributed along [0,1]. These facts may suffice to
indicate that the Cantor set is a very intricate mathematical object and
is just the sort of thing mathematicians delight in. We shall encounter
this set again from time to time, for its properties illustrate several
phenomena discussed in later sections.
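
Several of the claims in this paragraph can be checked by direct computation. The Python sketch below is ours, not the author's, and the names survives and f_partial are hypothetical. The function survives relies on an observation not stated in the text: a point of [0,1/3] or [2/3,1] survives the later deletions exactly when its rescaled image (3x or 3x − 2) survives the earlier ones. The function f_partial applies the digit rule t_n = 2b_n to a finite string of binary digits only, so it illustrates the mapping f rather than implementing it in full.

```python
from fractions import Fraction

def survives(x, stages):
    """Return True if x in [0,1] is not removed by the first `stages` deletions."""
    for _ in range(stages):
        if x <= Fraction(1, 3):
            x = 3 * x          # rescale the left piece [0,1/3] onto [0,1]
        elif x >= Fraction(2, 3):
            x = 3 * x - 2      # rescale the right piece [2/3,1] onto [0,1]
        else:
            return False       # x lies in an open middle third
    return True

def f_partial(bits):
    """Number with ternary digits t_n = 2*b_n for the finite digit list b_1, ..., b_k."""
    return sum((Fraction(2 * b, 3 ** k) for k, b in enumerate(bits, start=1)),
               Fraction(0))

# Total length removed in the 24 deletion stages leading from F_1 to F_25.
removed = sum(Fraction(2 ** (k - 1), 3 ** k) for k in range(1, 25))

print(survives(Fraction(1, 4), 50))           # True
print(survives(Fraction(1, 2), 50))           # False
print(f_partial([1, 0, 1]))                   # 20/27
print(survives(f_partial([1, 0, 1]), 50))     # True
print(removed == 1 - Fraction(2, 3) ** 24)    # True
print(2 ** 24)                                # 16777216
```

The point 1/4 is carried to 3/4 and back to 1/4 by the rescaling, so it is never deleted; and the removed length after 24 stages already differs from 1 by only (2/3)^24.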

We conclude this section by defining two additional concepts which 
are often useful. 

Let X be an arbitrary metric space, and let A be a subset of X. The
closure of A, denoted by Ā, is the union of A and the set of all its limit
points. Intuitively, Ā is A itself together with all other points in X
which are arbitrarily close to A. As an example, if A is the open unit
disc {z:|z| < 1} in the complex plane, then Ā is the closed unit disc
{z:|z| ≤ 1}. The main facts about closures are the following:

(1) Ā is a closed superset of A which is contained in every closed
superset of A (we express this by saying that Ā is the smallest
closed superset of A);

(2) A is closed ⇔ A = Ā;

(3) Ā equals the intersection of all closed supersets of A.

It is a routine exercise to prove these statements, and we leave this task 
to the reader (in Problem 6). 

Our second concept relates to the discussion of Fig. 19 given at the 
end of the previous section. Again, let X be a metric space and A a 
subset of X. A point in X is called a boundary point of A if each open 
sphere centered on the point intersects both A and A’, and the boundary 
of A is the set of all its boundary points. This concept possesses the 
following properties:

(1) the boundary of A equals the intersection of Ā and the closure of A';
(2) the boundary of A is a closed set;
(3) A is closed ⇔ it contains its boundary.


We ask the reader to give the proofs in Problem 11. 


Problems 


1. Let X be a metric space, and extend Problem 10-1 by proving the
following statements:
(a) any point and disjoint closed set in X can be separated by open
sets, in the sense that if x is a point and F a closed set which
does not contain x, then there exists a disjoint pair of open
sets G_1 and G_2 such that x ∈ G_1 and F ⊆ G_2;
(b) any disjoint pair of closed sets in X can be separated by open
sets, in the sense that if F_1 and F_2 are disjoint closed sets, then
there exists a disjoint pair of open sets G_1 and G_2 such that
F_1 ⊆ G_1 and F_2 ⊆ G_2.

2. Let X be a metric space, and let A be a subset of X. If x is a limit
point of A, show that each open sphere centered on x contains an
infinite number of distinct points of A. Use this result to show that
a finite subset of X is closed.

3. Show that a subset of a metric space is bounded ⇔ it is non-empty
and is contained in some closed sphere.

4. Give an example of an infinite class of closed sets whose union is not
closed. Give an example of a set which (a) is both open and closed;
(b) is neither open nor closed; (c) contains a point which is not a
limit point of the set; and (d) contains no point which is not a limit
point of the set.

5. Describe the interior of the Cantor set.

6. Prove the statements made in the text about closures.

7. Let X be a metric space and A a subset of X. Prove the following
facts:
(a) (Ā)' = Int(A');
(b) Ā = {x:d(x,A) = 0}.

8. Describe the closure of each of the following subsets of the real line:
the integers; the rationals; the Cantor set; (0, +∞); (−1,0) ∪ (0,1).
Do the same for each of the following subsets of the complex plane:
{z:|z| is rational}; {z:1/R(z) is an integer}; {z:|z| < 1 and
I(z) < 0}.

9. Let X be a metric space, let x be a point of X, and let r be a positive
real number. One is inclined to believe that the closure of S_r(x)
must equal S_r[x]. Give an example to show that this is not
necessarily true. (Hint: see Example 9-1.)

10. Let X be a metric space, and let G be an open set in X. Prove that
G is disjoint from a set A ⇔ G is disjoint from Ā.

11. Prove the facts about boundaries stated in the text.

12. Describe the boundary of each of the following subsets of the real
line: the integers; the rationals; [0,1]; (0,1). Do the same for each
of the following subsets of the complex plane: {z:|z| < 1}; {z:|z| ≤
1}; {z:I(z) > 0}.

13. Let X be a metric space and A a subset of X. A is said to be dense
(or everywhere dense) if Ā = X. Prove that A is dense ⇔ the only
closed superset of A is X ⇔ the only open set disjoint from A is
∅ ⇔ A intersects every non-empty open set ⇔ A intersects every
open sphere.


12. CONVERGENCE, COMPLETENESS, AND BAIRE’S THEOREM 


As we emphasized in the introduction to this chapter, one of our 
main aims in considering metric spaces is to study convergent sequences 
in a context more general than that of classical analysis. The fruits of 
this study are many, and among them is the added insight gained into 
ordinary convergence as it is used in analysis. 

Let X be a metric space with metric d, and let


{x_n} = {x_1, x_2, . . . , x_n, . . .}


be a sequence of points in X. We say that {x_n} is convergent if there
exists a point x in X such that either
(1) for each ε > 0, there exists a positive integer n_0 such that
n ≥ n_0 ⇒ d(x_n,x) < ε; or equivalently,
(2) for each open sphere S_ε(x) centered on x, there exists a positive
integer n_0 such that x_n is in S_ε(x) for all n ≥ n_0.
The reader should observe that the first condition is a direct generaliza- 
tion of convergence for sequences of numbers as defined in the introduc- 
tion, and that the second can be thought of as saying that each open 
sphere centered on x contains all points of the sequence from some place 
on. If we rely on our knowledge of what is meant by a convergent
sequence of real numbers, the statement that {x_n} is convergent can
equally well be defined as follows: there exists a point x in X such that
d(x_n,x) → 0. We usually symbolize this by writing


x_n → x,


and we express it verbally by saying that x_n approaches x, or that x_n
converges to x. It is easily seen from condition (2) and Problem 10-1 that
the point x in this discussion is unique, that is, that x_n → y with y ≠ x
is impossible. The point x is called the limit of the sequence {x_n}, and
we sometimes write x_n → x in the form


lim x_n = x.


The statements x_n → x and lim x_n = x mean exactly the same thing,
namely, that {x_n} is a convergent sequence with limit x.
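
Condition (1) can be made concrete for a familiar sequence. The sketch below is our own illustration, not part of the text: it takes the sequence x_n = 1/n on the real line with limit 0 and, for a given ε, computes the smallest n_0 that satisfies the condition. The names d and n0_for are hypothetical, and exact rational arithmetic is used so that the strict inequality is tested reliably.

```python
from fractions import Fraction
import math

def d(a, b):
    """The usual metric on the real line."""
    return abs(a - b)

def n0_for(eps):
    """Smallest n_0 such that d(1/n, 0) < eps for every n >= n_0."""
    return math.floor(1 / eps) + 1

eps = Fraction(1, 10)
n0 = n0_for(eps)
print(n0)                                                              # 11
print(all(d(Fraction(1, n), 0) < eps for n in range(n0, n0 + 1000)))   # True
print(d(Fraction(1, n0 - 1), 0) < eps)                                 # False
```

The last line shows that no smaller n_0 will do for this ε, since the tenth term lies on the edge of the open sphere S_ε(0) and not inside it.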

Every convergent sequence {x_n} has the following property: for each
ε > 0, there exists a positive integer n_0 such that m,n ≥ n_0 ⇒ d(x_m,x_n) <
ε. For if x_n → x, then there exists a positive integer n_0 such that n ≥
n_0 ⇒ d(x_n,x) < ε/2, and from this we see that


m,n ≥ n_0 ⇒ d(x_m,x_n) ≤ d(x_m,x) + d(x,x_n) < ε/2 + ε/2 = ε.


A sequence with this property is called a Cauchy sequence, and we have 
just shown that every convergent sequence is a Cauchy sequence. 
Loosely speaking, this amounts to the statement that if the terms of a 
sequence approach a limit, then they get close to one another. It is of 
basic importance to understand that the converse of this need not be true, 
that is, that a Cauchy sequence is not necessarily convergent. As an 
example, consider the subspace X = (0,1] of the real line. The sequence 
defined by x, = 1/n is easily seen to be a Cauchy sequence in this space, 
but it is not convergent, since the point 0 (which it wants to converge to) 
is not a point of the space. The difficulty which arises in this example 
stems from the fact that the notion of a convergent sequence is not 
intrinsic to the sequence itself, but also depends on the structure of the 
space in which it lies. A convergent sequence is not convergent "on its
own"; it must converge to some point in the space. Some writers
emphasize the distinction between convergent sequences and Cauchy
sequences by calling the latter "intrinsically convergent" sequences.
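
The example of the sequence x_n = 1/n in the subspace (0,1] can be examined numerically. The following sketch is ours (the names in_space and cauchy_index are hypothetical); it checks the Cauchy property on a finite stretch of the sequence and confirms that the only candidate for a limit, the point 0, is not a point of the space. A finite check of this kind illustrates the definition but of course proves nothing by itself.

```python
from fractions import Fraction
import math

def in_space(x):
    """Membership in the subspace X = (0,1] of the real line."""
    return 0 < x <= 1

def cauchy_index(eps):
    """An n_0 such that |1/m - 1/n| < eps whenever m, n >= n_0."""
    return math.ceil(1 / eps)

eps = Fraction(1, 100)
n0 = cauchy_index(eps)
terms = [Fraction(1, n) for n in range(n0, n0 + 200)]

print(all(in_space(t) for t in terms))                       # True
print(all(abs(s - t) < eps for s in terms for t in terms))   # True
print(in_space(0))                                           # False
```

The choice of n_0 rests on the estimate |1/m − 1/n| < 1/n_0 ≤ ε for m, n ≥ n_0, which is our own remark and is not taken from the text.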

A complete metric space is a metric space in which every Cauchy 
sequence is convergent. In rough terms, a metric space is complete if 
every sequence in it which tries to converge is successful, in the sense that 
it finds a point in the space to converge to. The space (0,1] mentioned 
above is not complete, but it evidently can be made so by adjoining the 
point 0 to it to form the slightly larger space [0,1]. As a matter of fact, 
any metric space, if it isn’t already complete, can be made so by suitably 
adjoining additional points. We outline this process in a problem at the 
end of Sec. 14. 

It is a fundamental fact of elementary analysis that the real line is a 
complete metric space. The complex plane is also complete, as we see
from the following argument. Let {z_n}, where z_n = a_n + ib_n, be a
Cauchy sequence of complex numbers. Then {a_n} and {b_n} are themselves
Cauchy sequences of real numbers, since


|a_m − a_n| ≤ |z_m − z_n|
and |b_m − b_n| ≤ |z_m − z_n|.


By the completeness of the real line, there exist real numbers a and b such
that a_n → a and b_n → b. If we now put z = a + ib, then we see that
z_n → z by means of


|z_n − z| = |(a_n + ib_n) − (a + ib)|
= |(a_n − a) + i(b_n − b)|
≤ |a_n − a| + |b_n − b|


and the fact that both final terms on the right approach 0. The
completeness of the complex plane thus depends directly on the completeness
of the real line. The metric space defined in Example 9-1 is also complete; 
for in this space a Cauchy sequence must be constant (i.e., it must consist 
of a single point repeated) from some place on, and it converges with that 
point as its limit. 

The first three of the five metric spaces given as examples in Sec. 9 
are therefore complete. What about the last two? 

We ask the reader to show in Problem 5 that Example 9-4 is not 
complete. The problem of completing this space leads to the modern 
theory of Lebesgue integration, and it would carry us too far afield to 
pursue this matter to its natural conclusion. 

On the other hand, the space 𝒞[0,1] defined in Example 9-5 is complete.
We prove this in a more general form in Sec. 14. The completeness of
this space, and of others similar to it, is one of the major focal
points of topology and modern analysis. 

The terms limit and limit point are often a source of confusion for
people not thoroughly accustomed to them. On the real line, for
instance, the constant sequence {1, 1, . . . , 1, . . .} is convergent
with limit 1; but the set of points of this sequence is the set consisting of
the single element 1, and by Problem 11-2, the point 1 is not a limit point
of this set. The essence of the matter is that a sequence of points in a set
is not a subset of the set: it is a function defined on the positive integers
with values in the set, and is usually specified by listing its values, as in
{x_n} = {x_1, x_2, . . . , x_n, . . .}, where x_n is the value of the function at
the integer n. A sequence may have a limit, but cannot have a limit 
point; and the set of points of a sequence may have a limit point, but 
cannot have a limit. The following theorem relates these concepts to 
one another and is a useful tool for some of our later work. 


Theorem A. If a convergent sequence in a metric space has infinitely many
distinct points, then its limit is a limit point of the set of points of the sequence.
PROOF. Let X be a metric space, and let {x_n} be a convergent sequence
in X with limit x. We assume that x is not a limit point of the set of
points of the sequence, and we show that it follows from this that the
sequence has only finitely many distinct points. Our assumption
implies that there exists an open sphere S_r(x) centered on x which
contains no point of the sequence different from x. However, since x is the
limit of the sequence, all x_n’s from some place on must lie in S_r(x), hence
must coincide with x. From this we see that there are only finitely many
distinct points in the sequence.


Our next theorem guarantees the completeness of many metric 
spaces which arise as subspaces of complete metric spaces. 


Theorem B. Let X be a complete metric space, and let Y be a subspace of
X. Then Y is complete ⇔ it is closed.
PROOF. We assume first that Y is complete as a subspace of X, and we
show that it is closed. Let y be a limit point of Y. For each positive
integer n, S_{1/n}(y) contains a point y_n in Y. It is clear that {y_n}
converges to y in X and is a Cauchy sequence in Y, and since Y is complete,
y is in Y. Y is therefore closed.

We now assume that Y is closed, and we show that it is complete.
Let {y_n} be a Cauchy sequence in Y. It is also a Cauchy sequence in X,
and since X is complete, {y_n} converges to a point x in X. We show that
x is in Y. If {y_n} has only finitely many distinct points, then x is that
point infinitely repeated and is thus in Y. On the other hand, if {y_n} has
infinitely many distinct points, then, by Theorem A, x is a limit point of
the set of points of the sequence; it is therefore also a limit point of Y, and
since Y is closed, x is in Y.


A sequence {A_n} of subsets of a metric space is called a decreasing
sequence if


A_1 ⊇ A_2 ⊇ A_3 ⊇ · · · .


The following theorem gives conditions under which the intersection of 
such a sequence is non-empty. 


Theorem C (Cantor’s Intersection Theorem). Let X be a complete metric
space, and let {F_n} be a decreasing sequence of non-empty closed subsets of X
such that d(F_n) → 0. Then F = ∩_{n=1}^∞ F_n contains exactly one point.

PROOF. It is first of all evident from the assumption d(F_n) → 0 that F
cannot contain more than one point, so it suffices to show that F is non-
empty. Let x_n be a point in F_n. Since d(F_n) → 0, {x_n} is a Cauchy
sequence. Since X is complete, {x_n} has a limit x. We show that x is in
F, and for this it suffices to show that x is in F_{n_0} for a fixed but
arbitrary n_0. If {x_n} has only finitely many distinct points, then x is
that point infinitely repeated, and is therefore in F_{n_0}. If {x_n} has
infinitely many distinct points, then x is a limit point of the set of points
of the sequence, it is a limit point of the subset {x_n: n ≥ n_0} of the set
of points of the sequence, it is a limit point of F_{n_0}, and thus (since
F_{n_0} is closed) it is in F_{n_0}.
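
A familiar instance of this theorem on the real line is the method of bisection. The sketch below is an illustration of our own choosing and is not drawn from the text: starting from [1,2], it halves the interval repeatedly while keeping the end-points on either side of the positive number whose square is 2. The intervals form a decreasing sequence of non-empty closed sets whose diameters approach 0, so by the theorem their intersection is a single point. The name nested_intervals is hypothetical.

```python
from fractions import Fraction

def nested_intervals(steps):
    """Closed intervals [lo, hi] with lo*lo <= 2 <= hi*hi, each half the last."""
    lo, hi = Fraction(1), Fraction(2)
    out = [(lo, hi)]
    for _ in range(steps):
        mid = (lo + hi) / 2
        if mid * mid <= 2:
            lo = mid
        else:
            hi = mid
        out.append((lo, hi))
    return out

F = nested_intervals(20)
# Each interval contains its successor.
print(all(a <= c and e <= b for (a, b), (c, e) in zip(F, F[1:])))   # True
# The diameter of the last interval.
print(F[-1][1] - F[-1][0])                                          # 1/1048576
```

The end-points here are all rational, while the single point of the intersection is not; this is consistent with the theorem, which is applied to the complete space of all real numbers and not to the rationals.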


A subset A of a metric space is said to be nowhere dense if its closure
has empty interior. It is easy to see that A is nowhere dense ⇔ Ā does
not contain any non-empty open set ⇔ each non-empty open set has a
non-empty open subset disjoint from Ā ⇔ each non-empty open set has a
non-empty open subset disjoint from A ⇔ each non-empty open set
contains an open sphere disjoint from A. If a nowhere dense set is thought of
as a set which doesn’t cover very much of the space, then our next 
theorem says that a complete metric space cannot be covered by any 
sequence of such sets. 


Theorem D. If {A_n} is a sequence of nowhere dense sets in a complete metric
space X, then there exists a point in X which is not in any of the A_n’s.
PROOF. For the duration of this proof, we abandon our usual notations
for open spheres and closed spheres. Since X is open and A_1 is nowhere
dense, there is an open sphere S_1 of radius less than 1 which is disjoint
from A_1. Let F_1 be the concentric closed sphere whose radius is one-half
that of S_1, and consider its interior. Since A_2 is nowhere dense, Int(F_1)
contains an open sphere S_2 of radius less than 1/2 which is disjoint from A_2.
Let F_2 be the concentric closed sphere whose radius is one-half that of S_2,
and consider its interior. Since A_3 is nowhere dense, Int(F_2) contains an
open sphere S_3 of radius less than 1/4 which is disjoint from A_3. Let F_3 be
the concentric closed sphere whose radius is one-half that of S_3.
Continuing in this way, we get a decreasing sequence {F_n} of non-empty
closed subsets of X such that d(F_n) → 0. Since X is complete, Theorem
C guarantees that there exists a point x in X which is in all the F_n’s.
This point is clearly in all the S_n’s, and therefore (since S_n is disjoint from
A_n) it is not in any of the A_n’s.
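
The construction in this proof can be carried out step by step in a simple case. The sketch below is ours and rests on several assumptions that are not in the text: the space is the real line, each nowhere dense set A_n is a single point, the construction starts inside the interval (0,1), and only finitely many sets are treated, so that the point produced avoids the listed points and no limit is taken. The name avoid_points is hypothetical.

```python
from fractions import Fraction

def avoid_points(points, lo=Fraction(0), hi=Fraction(1)):
    """Return closed intervals F_1, F_2, ... such that F_n misses points[n-1]."""
    closed = []
    for n, a in enumerate(points, start=1):
        # Choose an open interval S_n inside (lo, hi) which misses the point a.
        if lo < a < hi:
            left, right = (lo, a) if a - lo >= hi - a else (a, hi)
        else:
            left, right = lo, hi
        center = (left + right) / 2
        bound = Fraction(1, 2 ** (n - 1))       # S_n must have radius < bound
        radius = min((right - left) / 2, bound / 2)
        # F_n is the concentric closed interval with half the radius of S_n;
        # its interior (lo, hi) is where the next open interval is sought.
        lo, hi = center - radius / 2, center + radius / 2
        closed.append((lo, hi))
    return closed

# Some rational points of [0,1], each regarded as a nowhere dense set.
points = [Fraction(p, q) for q in range(1, 8) for p in range(q + 1)]
closed = avoid_points(points)
lo, hi = closed[-1]
x = (lo + hi) / 2

print(x not in points)                                                      # True
print(all(a < c and e < b for (a, b), (c, e) in zip(closed, closed[1:])))   # True
```

The point x lies in every F_n constructed, hence in every S_n, and so differs from each of the listed points. For an infinite sequence of sets the same steps continue without end, and it is Theorem C which supplies the point common to all the F_n’s.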


For our purposes, the following equivalent form of Theorem D is 
often more convenient. 


Theorem E. If a complete metric space is the union of a sequence of its
subsets, then the closure of at least one set in the sequence must have non-
empty interior.


Theorems D and E are really one theorem expressed in two different 
ways. We refer to both (or either) as Baire’s theorem. This theorem is 
admittedly rather technical in nature, and the reader can hardly be 
expected to appreciate its significance at the present stage of our work.
He will find, however, that a need for it crops up from time to time, and
when this need arises, Baire’s theorem is an indispensable tool.¹

¹ There is some rather undescriptive terminology which is often used in connection
with Baire’s theorem. We shall not make use of it ourselves, but the reader ought
to be acquainted with it. A subset of a metric space is called a set of the first category
if it can be represented as the union of a sequence of nowhere dense sets, and a set of
the second category if it is not a set of the first category. Baire’s theorem, sometimes
called the Baire category theorem, can now be expressed as follows: any complete
metric space (considered as a subset of itself) is a set of the second category.


Problems


1. Let X be a metric space. If {x_n} and {y_n} are sequences in X such
that x_n → x and y_n → y, show that d(x_n,y_n) → d(x,y).

2. Show that a Cauchy sequence is convergent ⇔ it has a convergent
subsequence.

3. If X = X_1 × X_2 × · · · × X_n is the product in Problem 9-4, and if
each of the coordinate spaces X_1, X_2, . . . , X_n is complete, show
that X is complete with respect to each of the two metrics defined
in that problem.

4. Let X be any non-empty set. By Problem 9-5, the set of all bounded
real functions defined on X is a metric space with respect to the
metric induced by the norm defined in that problem. Show that this
metric space is complete. (Hint: if {f_n} is a Cauchy sequence, then
{f_n(x)} is a Cauchy sequence of real numbers for each point x in X.)

5. In Example 9-4, show that the following functions f_n defined on [0,1]
form a Cauchy sequence in this space which is not convergent:
f_n(x) = 1 if 0 ≤ x ≤ 1/2, f_n(x) = −2^n(x − 1/2) + 1 if 1/2 ≤ x ≤
1/2 + (1/2)^n, and f_n(x) = 0 if 1/2 + (1/2)^n ≤ x ≤ 1.

6. Give an example to show that the set F in Cantor’s intersection
theorem may be empty if the hypothesis d(F_n) → 0 is dropped.

7. Show that a closed set is nowhere dense ⇔ its complement is
everywhere dense.

8. Show that the Cantor set is nowhere dense.


13. CONTINUOUS MAPPINGS


In the previous section we extended the idea of convergence to the
context of a general metric space. We now do the same for continuity.
Let X and Y be metric spaces with metrics d_1 and d_2, and let f be a
mapping of X into Y. f is said to be continuous at a point x_0 in X if either
of the following equivalent conditions is satisfied:
(1) for each ε > 0 there exists δ > 0 such that d_1(x,x_0) < δ ⇒
d_2(f(x),f(x_0)) < ε;
(2) for each open sphere S_ε(f(x_0)) centered on f(x_0) there exists an
open sphere S_δ(x_0) centered on x_0 such that f(S_δ(x_0)) ⊆ S_ε(f(x_0)).
The reader will notice that the first condition generalizes the elementary 
definition given in the introduction to this chapter, and that the second 
translates the first into the language of open spheres. 
Our first theorem expresses continuity at a point in terms of 
sequences which converge to the point. 


Theorem A. Let X and Y be metric spaces and f a mapping of X into Y.
Then f is continuous at x_0 if and only if x_n → x_0 ⇒ f(x_n) → f(x_0).

PROOF. We first assume that f is continuous at x_0. If {x_n} is a sequence
in X such that x_n → x_0, we must show that f(x_n) → f(x_0). Let S_ε(f(x_0))
be an open sphere centered on f(x_0). By our assumption, there exists an
open sphere S_δ(x_0) centered on x_0 such that f(S_δ(x_0)) ⊆ S_ε(f(x_0)). Since
x_n → x_0, all x_n’s from some place on lie in S_δ(x_0). Since f(S_δ(x_0)) ⊆
S_ε(f(x_0)), all f(x_n)’s from some place on lie in S_ε(f(x_0)). We see from
this that f(x_n) → f(x_0).

To prove the other half of our theorem, we assume that f is not
continuous at x_0, and we show that x_n → x_0 does not imply f(x_n) → f(x_0).
By this assumption, there exists an open sphere S_ε(f(x_0)) with the
property that the image under f of each open sphere centered on x_0 is not
contained in it. Consider the sequence of open spheres S_1(x_0), S_{1/2}(x_0),
. . . , S_{1/n}(x_0), . . . . Form a sequence {x_n} such that x_n ∈ S_{1/n}(x_0)
and f(x_n) ∉ S_ε(f(x_0)). It is clear that x_n converges to x_0 and that f(x_n)
does not converge to f(x_0).
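
The second half of this proof can be followed on a concrete function. In the sketch below, which is our own illustration, step is a real function with a jump at 0 and square is the function sending x to its square; neither name comes from the text. With x_0 = 0 and ε = 1/2, we choose a point x_n in each open sphere S_{1/n}(x_0) whose image under step lies outside S_ε(step(x_0)), just as the proof directs.

```python
from fractions import Fraction

def step(x):
    """A real function with a jump at 0 (an example of our own choosing)."""
    return 1 if x >= 0 else 0

def square(x):
    """The real function sending x to x squared."""
    return x * x

x0 = Fraction(0)
eps = Fraction(1, 2)
# x_n = x0 - 1/(n+1) lies in the open sphere S_{1/n}(x0).
xs = [x0 - Fraction(1, n + 1) for n in range(1, 200)]

print(all(abs(x - x0) < Fraction(1, n) for n, x in enumerate(xs, start=1)))  # True
print(all(abs(step(x) - step(x0)) >= eps for x in xs))                       # True
print(abs(square(xs[-1]) - square(x0)) < eps)                                # True
```

The sequence converges to x_0, yet the values of step stay at distance 1 from step(x_0), so step is not continuous at 0. The values of square along the same sequence do enter S_ε(square(x_0)), as the first half of the theorem requires of a continuous function.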


A mapping of one metric space into another is said to be continuous 
if it is continuous at each point in its domain. The following theorem is 
an immediate consequence of Theorem A and this definition. 


Theorem B. Let X and Y be metric spaces and f a mapping of X into Y.
Then f is continuous if and only if x_n → x ⇒ f(x_n) → f(x).


This result shows that continuous mappings of one metric space into 
another are precisely those which send convergent sequences into con- 
vergent sequences, or, in other words, which preserve convergence. Our 
next theorem characterizes continuous mappings in terms of open sets. 


Theorem C. Let X and Y be metric spaces and f a mapping of X into Y.
Then f is continuous ⇔ f⁻¹(G) is open in X whenever G is open in Y.


PROOF. We first assume that f is continuous. If G is an open set in Y,
we must show that f⁻¹(G) is open in X. f⁻¹(G) is open if it is empty, so
we may assume that it is non-empty. Let x be a point in f⁻¹(G). Then
f(x) is in G, and since G is open, there exists an open sphere S_ε(f(x))
centered on f(x) and contained in G. By the definition of continuity,
there exists an open sphere S_δ(x) such that f(S_δ(x)) ⊆ S_ε(f(x)). Since
S_ε(f(x)) ⊆ G, we also have f(S_δ(x)) ⊆ G, and from this we see that
S_δ(x) ⊆ f⁻¹(G). S_δ(x) is therefore an open sphere centered on x and
contained in f⁻¹(G), so f⁻¹(G) is open.

We now assume that f⁻¹(G) is open whenever G is, and we show
that f is continuous. We show that f is continuous at an arbitrary
point x in X. Let S_ε(f(x)) be an open sphere centered on f(x). This
open sphere is an open set, so its inverse image is an open set which
contains x. By this, there exists an open sphere S_δ(x) which is contained
in this inverse image. It is clear that f(S_δ(x)) is contained in S_ε(f(x)),
so f is continuous at x. Finally, since x was taken to be an arbitrary
point in X, f is continuous.


The fact just established—that continuous mappings are precisely 
those which pull open sets back to open sets—will be of great importance 
for all our work from Chap. 3 on. 

We now come to the useful concept of uniform continuity. In 
order to explain what this is, we examine the definition of continuity 
expressed in condition (1) at the beginning of this section. Let X and 
Y be metric spaces with metrics d_1 and d_2, and let f be a mapping of
X into Y. We assume that f is continuous, that is, that for each point
x_0 in X the following is true: given ε > 0, a number δ > 0 can be found
such that d_1(x,x_0) < δ ⇒ d_2(f(x),f(x_0)) < ε. The reader is no doubt
familiar with the idea that if x_0 is held fixed and ε is made smaller, then,
in general, δ has to be made correspondingly smaller. Thus, in the case
of the real function f defined by f(x) = 2x, δ can always be chosen as any
positive number ≤ ε/2, and no larger δ will do. In general, therefore,
δ depends on ε. Let us return to our examination of the definition. It
says that for our given ε, a δ can be found which "works" in the above
sense at the particular point x_0 under consideration. But if we hold
ε fixed and move to another point x_0, then it may happen that this δ no
longer works; that is, it may be necessary to take a smaller δ to satisfy
the requirement of the definition. We see in this way that δ may well
depend, in general, not only on ε but also on x_0. Uniform continuity is
essentially continuity plus the added condition that for each ε we can
find a δ which works uniformly over the entire space X, in the sense that
it does not depend on x_0. The formal definition is as follows. If X and
Y are metric spaces with metrics d_1 and d_2, then a mapping f of X into
Y is said to be uniformly continuous if for each ε > 0 there exists δ > 0
such that d_1(x,x') < δ ⇒ d_2(f(x),f(x')) < ε. It is clear that any
uniformly continuous mapping is automatically continuous. The reader
will observe that the above real function f defined on the entire real
line R by f(x) = 2x is uniformly continuous. On the other hand, the
function g defined on R by g(x) = x² is continuous but not uniformly
continuous. Similarly, the continuous function h defined on (0,1) by
h(x) = 1/x is not uniformly continuous.
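
The dependence of δ on the point can be seen in figures. The sketch below is ours and is not taken from the text. For the function sending x to 2x, it uses δ = ε/2 at every point. For the function sending x to x², it uses the value √(x_0² + ε) − x_0 at a point x_0 ≥ 0, which a short calculation (our own, and offered here as an assumption) suggests is the largest δ that works at x_0; the formula is written in an equivalent form that is better suited to floating-point arithmetic. The function names are hypothetical.

```python
import math

def delta_double(x0, eps):
    """A delta that works for f(x) = 2x at every point x0."""
    return eps / 2

def delta_square(x0, eps):
    """sqrt(x0**2 + eps) - x0, a delta for g(x) = x**2 at a point x0 >= 0."""
    return eps / (math.sqrt(x0 * x0 + eps) + x0)

eps = 0.1
for x0 in (0.0, 1.0, 10.0, 100.0, 1000.0):
    print(x0, delta_double(x0, eps), delta_square(x0, eps))
```

The second column stays fixed, while the third shrinks toward 0 as x_0 grows, roughly like ε/(2x_0). No single positive δ can therefore serve for x² at every point of R, which is what the failure of uniform continuity means.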

Uniformly continuous mappings—as opposed to those which are 
merely continuous—are of particular significance in analysis. The fol- 
lowing theorem expresses a property of these mappings which is often 
useful. 


Theorem D. Let X be a metric space, let Y be a complete metric space, and
let A be a dense subspace of X. If f is a uniformly continuous mapping of
A into Y, then f can be extended uniquely to a uniformly continuous mapping
g of X into Y.
PROOF. Let d_1 and d_2 be the metrics on X and Y. If A = X, the
conclusion is obvious. We therefore assume that A ≠ X. We begin
by showing how to define the mapping g. If x is a point in A, we define
g(x) to be f(x). Now let x be a point in X − A. Since A is dense, x
is the limit of a convergent sequence {a_n} in A. Since {a_n} is a Cauchy
sequence and f is uniformly continuous, {f(a_n)} is a Cauchy sequence in Y
(see Problem 8). Since Y is complete, there exists a point in Y (we call
this point g(x)) such that f(a_n) → g(x). We must make sure that g(x)
depends only on x, and not on the sequence {a_n}. Let {b_n} be another
sequence in A such that b_n → x. Then d_1(a_n,b_n) → 0, and by the uniform
continuity of f, d_2(f(a_n),f(b_n)) → 0. It readily follows from this that
f(b_n) → g(x).

We next show that g is uniformly continuous. Let ε > 0 be given,
and use the uniform continuity of f to find δ > 0 such that for a and a'
in A we have d_1(a,a') < δ ⇒ d_2(f(a),f(a')) < ε. Let x and x' be any
points in X such that d_1(x,x') < δ. It suffices to show that d_2(g(x),
g(x')) ≤ ε. Let {a_n} and {a_n'} be sequences in A such that a_n → x and
a_n' → x'. By the triangle inequality, we see that d_1(a_n,a_n') ≤ d_1(a_n,x) +
d_1(x,x') + d_1(x',a_n'). This inequality, together with the facts that
d_1(a_n,x) → 0, d_1(x,x') < δ, and d_1(x',a_n') → 0, implies that d_1(a_n,a_n') < δ
for all sufficiently large n. It now follows that d_2(f(a_n),f(a_n')) < ε
for all sufficiently large n. By Problem 12-1,


d_2(g(x),g(x')) = lim d_2(f(a_n),f(a_n')),
and from this and the previous statement we see that d_2(g(x),g(x')) ≤ ε.


All that remains is to show that g is unique, and this is easily proved
by means of Problem 3 below.

There is an important type of uniformly continuous mapping which
often arises in practice. If X and Y are metric spaces with metrics d_1
and d_2, a mapping f of X onto Y is called an isometry (or an isometric
mapping) if d_1(x,x') = d_2(f(x),f(x')) for all points x and x' in X; and if
such a mapping exists, we say that X is isometric to Y. It is clear
that an isometry is necessarily one-to-one. If X is isometric to Y, then
the points of these spaces can be put into one-to-one correspondence in
such a way that the distances between pairs of corresponding points are
the same. The spaces therefore differ only in the nature of their points,
and this is often unimportant. We usually consider isometric spaces to
be identical with one another. It is often convenient to be able to use
this terminology in the case of mappings which are not necessarily onto.
If f is a mapping of X into Y which preserves distances in the above
sense, then we call f an isometry of X into Y, or an isometry of X onto
the subspace f(X) of Y. In this situation, we often say that Y contains
an isometric image of X, namely, the subspace f(X).
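
A simple isometry of the complex plane onto itself is a rotation followed by a translation. The sketch below is an example of our own and is not taken from the text; the constants U and C and the function f are hypothetical choices. It checks on a few sample points that distances are preserved, up to the rounding error inherent in floating-point arithmetic.

```python
import math

def d(z, w):
    """The usual metric on the complex plane."""
    return abs(z - w)

U = complex(0.6, 0.8)     # a complex number of absolute value 1 (a rotation)
C = complex(2.0, -1.0)    # an arbitrary translation

def f(z):
    """Rotate by U and then translate by C."""
    return U * z + C

points = [complex(0, 0), complex(1, 0), complex(-2, 3.5), complex(0.25, -7)]
print(all(math.isclose(d(z, w), d(f(z), f(w)), abs_tol=1e-12)
          for z in points for w in points))   # True
```

Distances are preserved because |U(z − w)| = |U||z − w| and |U| = 1; a check on sample points illustrates this identity but does not replace it.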


Problems 


1. Let X and Y be metric spaces and f a mapping of X into Y. If f is 
a constant mapping, show that f is continuous. Use this to show 
that a continuous mapping need not have the property that the 
image of every open set is open. 

2. Let X be a metric space with metric d, and let x_0 be a fixed point in X. 
Show that the real function f_{x_0}, defined on X by f_{x_0}(x) = d(x,x_0), is 
continuous. Is it uniformly continuous? 

3. Let X and Y be metric spaces and A a non-empty subset of X. If 
f and g are continuous mappings of X into Y such that f(x) = g(x) 
for every x in A, show that f(x) = g(x) for every x in the closure $\overline{A}$ of A. 

4. Let X and Y be metric spaces and f a mapping of X into Y. Show 
that f is continuous ⇔ f^{-1}(F) is closed in X whenever F is closed in 
Y ⇔ $f(\overline{A}) \subseteq \overline{f(A)}$ for every subset A of X. 

5. Show that any mapping of the metric space defined in Example 9-1 
into any other metric space is continuous. 

6. Consider the real function f defined on the real line R by f(x) = x^2. 
If b is a given positive real number, show that the restriction of f to the 
closed interval [0,b] is uniformly continuous by starting with an 
ε > 0 and exhibiting a δ > 0 which satisfies the requirement of the 
definition. 

7. Determine which of the following functions are uniformly continuous 
on the open unit interval (0,1): 1/(1 - x); 1/(2 - x); sin x; sin (1/x); 
x^{1/2}; x^3. Which are uniformly continuous on the open interval 
(0, +∞)? 
8. In the proof of Theorem D we used the following fact: the image of 
a Cauchy sequence under a uniformly continuous mapping is again 
a Cauchy sequence. Give the details of the proof. 

9. Let f be a continuous real function defined on R which satisfies the 
functional equation f(x + y) = f(x) + f(y). Show that this function 
must have the form f(x) = mx for some real number m. (Hint: 
the subspace of rational numbers is dense in the metric space R.) 


14. SPACES OF CONTINUOUS FUNCTIONS 


In Example 9-5 we gave a brief description of the metric space 
C[0,1]. The reader will recall that the points of this space are the 
bounded continuous real functions defined on the closed unit interval 
[0,1] and that its metric is defined by d(f,g) = sup |f(x) - g(x)|. We 
have two aims in this section: to generalize this very important example 
by considering functions defined on an arbitrary metric space, and to 
place all function spaces of this type in their proper context by giving 
the details of the structural pattern (discussed briefly in Sec. 9) which 
they all have in common with one another. We begin with the second, 
and define the algebraic systems which are relevant to our present 
interests. 

Let L be a non-empty set, and assume that each pair of elements x 
and y in L can be combined by a process called addition to yield an 
element z in L denoted by z = x + y. Assume also that this operation of 
addition satisfies the following conditions: 

(1) x + y = y + x; 

(2) x + (y + z) = (x + y) + z; 

(3) there exists in L a unique element, denoted by 0 and called 
the zero element, or the origin, such that x + 0 = x for every x; 

(4) to each element x in L there corresponds a unique element 
in L, denoted by -x and called the negative of x, such that 
x + (-x) = 0. 

We adopt the device of referring to the system of real numbers or to the 
system of complex numbers as the scalars. We now assume that each 
scalar α and each element x in L can be combined by a process called 
scalar multiplication to yield an element y in L denoted by y = αx in 
such a way that 

(5) α(x + y) = αx + αy; 

(6) (α + β)x = αx + βx; 

(7) (αβ)x = α(βx); 

(8) 1·x = x. 

The algebraic system L defined by these operations and axioms is called 
a linear space. Depending on the numbers admitted as scalars (only the 
real numbers, or all the complex numbers), we distinguish when necessary 
between real linear spaces and complex linear spaces. For geometric 
reasons discussed in the next section, a linear space is often called a 
vector space, and its elements are spoken of as vectors. 

We are not concerned here with developing the algebraic theory of 
linear spaces. Our only interest is in making available some pertinent 
concepts and terminology which are useful as a background against which 
to view the metric spaces we wish to study. With this in mind, we mention 
a few simple facts which are quite easy to prove from the axioms 
for a linear space: 0 + x = x for every x; x + z = y + z ⇒ x = y 
(hint: add -z to both sides on the right); α·0 = 0 (hint: α·0 + αx = 
α(0 + x) = αx = 0 + αx); 0·x = 0 (hint: 0·x + αx = (0 + α)x = 
αx = 0 + αx); and (-1)x = -x (hint: x + (-1)x = 1·x + (-1)x = 
(1 + (-1))x = 0·x = 0). The reader will notice that in the relation 
0·x = 0 we have used the symbol 0 with two different meanings: as a 
scalar on the left and as a vector on the right. Several other meanings 
will be given to this single symbol, but fortunately it is always possible to 
avoid confusion by attending closely to the context in which it occurs. 
It is convenient to introduce the operation of subtraction by using the 
symbol x — y as an abbreviation for x + (—y); x — y is called the 
difference between x and y. 

A non-empty subset M of a linear space L is called a linear subspace 
of L if x + y is in M whenever x and y are and if αx is in M (for any 
scalar α) whenever x is. Since M is non-empty, 0·x = 0 shows that 
0 is in M. Since -x = (-1)x, -x is in M whenever x is. It will be 
seen at once that a linear subspace of a linear space is itself a linear space 
with respect to the same operations. 

A normed linear space is a linear space on which there is defined a 
norm, i.e., a function which assigns to each element x in the space a real 
number ||x|| in such a manner that 

(1) ||x|| ≥ 0, and ||x|| = 0 ⇔ x = 0; 

(2) ||x + y|| ≤ ||x|| + ||y||; 

(3) ||αx|| = |α| ||x||. 

In general terms, a normed linear space is simply a linear space in which 
there is available a satisfactory notion of the distance from an arbitrary 
element to the origin. From (3) and the fact that -x = (-1)x, we 
obtain ||-x|| = ||x||. As we saw in Sec. 9, a normed linear space is a 
metric space with respect to the induced metric defined by 


d(x,y) = ||x - y||. 
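
The passage from a norm to its induced metric can be illustrated in a few lines. This is a minimal sketch under stated assumptions: the maximum norm on pairs of real numbers is chosen here only as a convenient example of a norm, and the sample points are arbitrary.

```python
from itertools import product

def norm(x):
    """Maximum norm on pairs of reals (an illustrative choice of norm)."""
    return max(abs(c) for c in x)

def d(x, y):
    """Induced metric d(x, y) = ||x - y||."""
    return norm(tuple(a - b for a, b in zip(x, y)))

points = [(0.0, 0.0), (1.0, -2.0), (3.0, 0.5), (-1.5, 4.0)]

# The metric axioms, checked on the sample points only.
symmetric = all(d(x, y) == d(y, x) for x, y in product(points, repeat=2))
separates = all((d(x, y) == 0) == (x == y) for x, y in product(points, repeat=2))
triangle = all(d(x, z) <= d(x, y) + d(y, z) + 1e-12
               for x, y, z in product(points, repeat=3))
print(symmetric, separates, triangle)
```

Symmetry of d reflects ||-x|| = ||x||, and the triangle inequality for d is condition (2) applied to (x - y) + (y - z).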


A Banach space is a normed linear space which is complete as a metric 
space. By Theorem 12-B, any closed linear subspace of a Banach space 
is itself a Banach space with respect to the same algebraic operations and 
the same norm. 

So much for the technical framework. We now turn to the metric 
spaces which really concern us. They are all function spaces, in the 
sense that they are linear spaces whose elements are functions defined 
on some non-empty set X with addition and scalar multiplication defined 
pointwise, i.e., by (f + g)(x) = f(x) + g(x) and (αf)(x) = αf(x). We 
note that the zero element in such a linear space is the constant function 
0 whose only value is the scalar 0 and that (-f)(x) = -f(x). 

Suppose, then, that X is an arbitrary non-empty set, and consider 
the set L of all real functions defined on X. It is clear that L is a real 
linear space with respect to the operations described above. We now 
restrict ourselves to the subset B consisting of the bounded functions in 
L. B is obviously a linear subspace of L, so it is a linear space in its 
own right. Even more, if we define a norm on B by ||f|| = sup |f(x)|, 
then B is a Banach space (see Problems 9-5 and 12-4). 
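
For a finite underlying set the pointwise operations and the supremum norm can be written out directly. The sketch below assumes a hypothetical three-element set X and represents a function as a dictionary from points to values; on a finite set the supremum is simply a maximum.

```python
X = ["a", "b", "c"]  # hypothetical finite underlying set

def add(f, g):
    """Pointwise sum: (f + g)(x) = f(x) + g(x)."""
    return {x: f[x] + g[x] for x in X}

def scale(alpha, f):
    """Pointwise scalar multiple: (alpha f)(x) = alpha * f(x)."""
    return {x: alpha * f[x] for x in X}

def sup_norm(f):
    """||f|| = sup |f(x)|, which is a maximum because X is finite."""
    return max(abs(f[x]) for x in X)

f = {"a": 1.0, "b": -3.0, "c": 2.0}
g = {"a": 0.5, "b": 4.0, "c": -2.0}

print(sup_norm(f), sup_norm(g), sup_norm(add(f, g)))
print(sup_norm(scale(-2.0, f)))
```

Here ||f + g|| is strictly smaller than ||f|| + ||g||, which shows that the triangle inequality for the norm need not be an equality.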

We next assume that the underlying set X is a metric space. This 
enables us to consider the possible continuity of functions defined on 
X. We define C(X,R) to be that subset of B which consists of continuous 
functions. C(X,R) is thus the set of all bounded continuous 
real functions defined on the metric space X, and it is non-empty by 
Problem 13-1. 


Lemma. If f and g are continuous real functions defined on a metric space 
X, then f + g and αf are also continuous, where α is any real number. 

PROOF. Let d be the metric on X. We show that f + g is continuous by 
showing that it is continuous at an arbitrary point x_0 in X. Let ε > 0 
be given. Since f and g are continuous, and thus continuous at x_0, we 
can find δ_1 > 0 and δ_2 > 0 such that d(x,x_0) < δ_1 ⇒ |f(x) - f(x_0)| < ε/2 
and d(x,x_0) < δ_2 ⇒ |g(x) - g(x_0)| < ε/2. Let δ be the smaller of the 
numbers δ_1 and δ_2. Then the continuity of f + g at x_0 follows from 


d(x,x_0) < δ ⇒ |(f + g)(x) - (f + g)(x_0)| = |[f(x) + g(x)] - [f(x_0) 
+ g(x_0)]| = |[f(x) - f(x_0)] + [g(x) - g(x_0)]| ≤ |f(x) - f(x_0)| 
+ |g(x) - g(x_0)| < ε/2 + ε/2 = ε. 


We leave it to the reader to show similarly that αf is continuous. 


This lemma implies that C(X,R) is a linear subspace of the linear 
space B. We next prove that it is closed as a subset of the metric 
space B. 
Lemma. C(X,R) is a closed subset of the metric space B. 

PROOF. Let f be a function in B which is in the closure of C(X,R). We 
show that f is continuous, and therefore is in C(X,R), by showing that it is 
continuous at an arbitrary point x_0 in X. Since a set which equals its 
closure is closed, this will suffice to prove the lemma. Let d be the 
metric on X, and let ε > 0 be given. Since f is in the closure of C(X,R), 
there exists a function f_0 in C(X,R) such that ||f - f_0|| < ε/3, from which 
it follows that |f(x) - f_0(x)| < ε/3 for every point x in X. Since f_0 
is continuous, and hence continuous at x_0, we can find a δ > 0 such that 
d(x,x_0) < δ ⇒ |f_0(x) - f_0(x_0)| < ε/3. The fact that f is continuous at 
x_0 now follows from 


d(x,x_0) < δ ⇒ |f(x) - f(x_0)| = |[f(x) - f_0(x)] + [f_0(x) - f_0(x_0)] 
+ [f_0(x_0) - f(x_0)]| ≤ |f(x) - f_0(x)| + |f_0(x) - f_0(x_0)| 
+ |f_0(x_0) - f(x_0)| < ε/3 + ε/3 + ε/3 = ε. 


Since a closed linear subspace of a Banach space is itself a Banach 
space, we can summarize the result of the above discussion and lemmas 
in the following theorem. 


Theorem A. The set C(X,R) of all bounded continuous real functions 
defined on a metric space X is a real Banach space with respect to pointwise 
addition and scalar multiplication and the norm defined by ||f|| = sup |f(x)|. 


It is desirable at this stage to make a clear distinction between two 
types of convergence for sequences of functions. Let X be a metric 
space, and let {f_n} be a sequence of real functions defined on X. If for 
each x in X it happens that {f_n(x)} is a Cauchy sequence of real numbers, 
then by the completeness of the real number system we can define a limit 
function f by f(x) = lim f_n(x). We then say that f_n converges pointwise 
to f, or that f is the pointwise limit of f_n. It is often important to know 
what properties of the functions f_n carry over to the limit function f, but 
unless we strengthen the mode of convergence, very little can be said 
along these lines. The stronger type of convergence normally needed to 
conclude anything of interest is called uniform convergence. In order to 
explain what this is, we inspect a little more closely what is involved in 
pointwise convergence. To say that f_n converges pointwise to f is to say 
the following: for each point x in X, if ε > 0 is given, then a positive 
integer n_0 can be found such that |f_n(x) - f(x)| < ε for all n ≥ n_0. In 
general, the integer n_0 may depend on x as well as ε. If, however, for 
each given ε an integer n_0 can be found which serves for all points x, then we 
say that f_n converges uniformly to f, or that f is the uniform limit of f_n. 
The reader will observe that these concepts are quite independent of the 
assumption that X is a metric space and that they are meaningful for 
functions defined on an arbitrary non-empty set. 
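
The dependence of n_0 on the point x can be seen in a small numerical experiment. The sketch below uses the functions f_n(x) = x^n on the half-open interval [0,1), a standard example that is not taken from the text; the helper `witness` is a hypothetical name for a point at which f_n still has the value 1/2.

```python
def f(n, x):
    """f_n(x) = x**n, viewed as a function on the half-open interval [0, 1)."""
    return x ** n

def witness(n):
    """A point of [0, 1) where f_n takes the value 1/2 (up to rounding)."""
    return 0.5 ** (1.0 / n)

indices = (1, 10, 100, 1000)

# Pointwise convergence: at a fixed x the values f_n(x) tend to 0.
values_at_fixed_x = [f(n, 0.9) for n in indices]

# No uniform convergence to 0: for every n there is a point where
# |f_n(x) - 0| is about 1/2, so no single n_0 serves all x when epsilon < 1/2.
values_at_witness = [f(n, witness(n)) for n in indices]

print(values_at_fixed_x)
print(values_at_witness)
```

At x = 0.9 the values shrink rapidly, while along the moving points `witness(n)` they stay near 1/2; this is exactly the failure of a single n_0 to serve for all points.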
It will be seen at once that convergence in the function space C(X,R) 
is precisely uniform convergence as we have just defined it. The fact 
that C(X,R) is complete can be restated as follows in the language of 
uniform convergence: if a bounded real function f defined on X is the 
uniform limit of a sequence {f_n} of bounded continuous real functions 
defined on X, then f is also continuous. In other words, in the presence 
of uniform convergence, continuity carries over from the f_n's to the limit 
function f. 

A moment’s thought will convince the reader that the entire discus- 
sion given above, beginning with our definition of the linear space L, 
could perfectly well have been based on complex functions. Without 
going again through all the details, we state the following theorem and 
consider it proved. 


Theorem B. The set C(X,C) of all bounded continuous complex functions 
defined on a metric space X is a complex Banach space with respect to point- 
wise addition and scalar multiplication and the norm defined by 


||f|| = sup |f(x)|. 


In summary, we associate with each metric space X two linear spaces 
of continuous functions defined on X. The first, C(X,R), contains 
only real functions, and the second, C(X,C), consists of complex 
functions. Further, all functions considered are assumed to be bounded, 
so that the norm defined by ||f|| = sup |f(x)| is always a real number. 
In the special case in which X is a closed interval [a,b] on the real line, we 
write C(X,R) in the simpler form C[a,b]. 


Problems 


1. Show that a non-empty subset A of a Banach space is bounded ⇔ 
there exists a real number K such that ||x|| ≤ K for every x in A. 

2. Construct a sequence of continuous functions defined on [0,1] which 
converges pointwise but not uniformly to a continuous limit. 

3. Construct a sequence of continuous functions defined on [0,1] which 
converges pointwise to a discontinuous limit. 

4. Let X and Y be metric spaces with metrics d_1 and d_2, and let {f_n} be a 
sequence of mappings of X into Y which converges pointwise to a 
mapping f of X into Y, in the sense that f_n(x) → f(x) for each x in X. 
Define what ought to be meant by the statement that f_n converges 
uniformly to f, and prove that under this assumption f is continuous 
if each f_n is continuous. 

5. In this problem we give a procedure for constructing the completion 
X* of an arbitrary metric space X. Denote by d the metric on X. 
Let x_0 be a fixed point in X, and to each point x in X make correspond 
the real function f_x defined on X by f_x(y) = d(y,x) - d(y,x_0). 

(a) Show that f_x is bounded. (Hint: |f_x(y)| ≤ d(x,x_0).) 

(b) Show that f_x is continuous. (Hint: |f_x(y_1) - f_x(y_2)| ≤ 2d(y_1,y_2).) 

By (a) and (b), the mapping F defined by F(x) = f_x is a mapping of 
X into C(X,R). 

(c) Show that F is an isometry. (Hint: |f_{x_1}(y) - f_{x_2}(y)| ≤ 
d(x_1,x_2).) 

F is thus an isometry of X into the complete metric space 
C(X,R). We define the completion X* of X to be the closure of 
F(X) in C(X,R). 

(d) Show that X* is a complete metric space which contains an 
isometric image of X. 

(e) Show that there is a natural isometry of X* into any complete 
metric space Y which contains an isometric image of X (to say 
that an isometry of X* into Y is “natural” means that the 
image of a point in X* which corresponds to a point in X is the 
point in Y which corresponds to this same point in X). 

(f) Show that (d) and (e) characterize X* in the following sense: 
if Z is a complete metric space which contains an isometric 
image of X, and if there is a natural isometry of Z into any 
complete metric space Y which contains an isometric image of 
X, then there is a natural isometry of Z onto X*. 

(g) Show that if X occurs as a subspace of a complete metric space, 
then there is a natural isometry of the closure of X onto X*. 

(h) Show that there is a natural isometry of any complete metric 
space which contains X as a dense subspace onto X*.¹ 
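
The embedding x → f_x of Problem 5 can be tried out on a small example. The sketch below assumes a hypothetical four-point metric space (points of the real line with the usual distance) and only checks the bounds from hints (a) and (c) numerically; it is an illustration of the construction, not a solution of the problem.

```python
X = [0, 1, 3, 7]   # hypothetical finite metric space: points of the real line
x0 = X[0]          # the fixed base point

def d(x, y):
    """Usual distance on the real line."""
    return abs(x - y)

def F(x):
    """The function f_x, stored as a dict y -> d(y, x) - d(y, x0)."""
    return {y: d(y, x) - d(y, x0) for y in X}

def sup_dist(f, g):
    """Distance in C(X,R): sup |f(y) - g(y)|, a maximum since X is finite."""
    return max(abs(f[y] - g[y]) for y in X)

# Hint (a): |f_x(y)| <= d(x, x0) for every y.
bounded = all(abs(v) <= d(x, x0) for x in X for v in F(x).values())

# Part (c): the distance between f_a and f_b equals d(a, b).
isometric = all(sup_dist(F(a), F(b)) == d(a, b) for a in X for b in X)
print(bounded, isometric)
```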


15. EUCLIDEAN AND UNITARY SPACES 


Let n be a fixed positive integer, and consider the set R^n of all ordered 
n-tuples x = (x_1, x_2, ..., x_n) of real numbers.² We promised in Sec. 4 
to make this set into a space, and we are now in a position to do so. 


1 The construction outlined in (a) to (c) clearly depends on the initial choice of the 
fixed point x_0. If another fixed point x_0' is chosen, then another isometry F' of X into 
C(X,R) is determined. It would seem, therefore, that there is little justification for 
calling the particular X* defined in this problem the completion of X. In practice, 
however, we usually pursue the reasonable course of regarding isometric spaces as 
essentially identical. From this point of view, the X* defined here is a complete 
metric space which contains X as a dense subspace; and since by (h) it is the only 
complete metric space with this property, it is natural to call it the completion of X. 

2 From this point on, we omit the adjective “ordered.” It is to be understood 
that an n-tuple is always ordered. 
We begin by defining addition and scalar multiplication in R^n. 
If x = (x_1, x_2, ..., x_n) and y = (y_1, y_2, ..., y_n), then we define x + y 
and αx (where α is any real number) by 


x + y = (x_1 + y_1, x_2 + y_2, ..., x_n + y_n) 
and αx = (αx_1, αx_2, ..., αx_n). 


With the algebraic operations defined coordinatewise in this way, R^n is a 
real linear space. The origin or zero element is clearly 0 = (0, 0, ..., 0), 
and the negative of an element x = (x_1, x_2, ..., x_n) is 


-x = (-x_1, -x_2, ..., -x_n). 


When we speak of R^n as an n-dimensional space, all we mean at this stage 
is that each element x = (x_1, x_2, ..., x_n) is the ordered array of its n 
coordinates x_1, x_2, ..., x_n. 

The reader is probably familiar with vector algebra in the ordinary 
three-dimensional space of our physical intuition. If so, then he is 
accustomed to regarding a point in this space as being essentially identical 
with the arrow (or vector) from the origin to that point, in the sense that 
given the point, the vector is determined, and given the vector, the point 
is determined. This situation is illustrated in Fig. 21. The above 
definitions of addition and scalar multiplication in R^n correspond to vector 
addition and the multiplication of a vector by a real number. A word of 
warning must be given. In ordinary vector algebra, a vector is usually 
allowed to have its tail at any point in the space and its head at any 
other point. It should be clearly understood, however, that for us a 
vector always has its tail at the origin. In accordance with this intuitive 
picture, we may think of the elements of the real linear space R^n either 
as points or as generalized vectors from the origin to those points. The 
latter view is often more fruitful and illuminating. 


Fig. 21. A vector (or point) in ordinary space. 


There is yet a third interpretation of the elements of R^n, of great 
significance from the point of view of generalizations. An n-tuple 
x = (x_1, x_2, ..., x_n) of real numbers can be thought of as a real function 
f defined on the set {1, 2, ..., n} of the first n positive integers. The 
ith coordinate x_i of x is then just the value of this function at the integer 
i (f(i) = x_i), and the coordinatewise operations defined above become 
pointwise operations. This way of thinking about the elements of R^n 
should help to allay any doubts which might be felt as to the feasibility 
of visualizing n-dimensional spaces for n ≥ 4. The four-dimensional 
space R^4, for instance, is merely the space of all real functions defined on 
the set consisting of the first four positive integers, and there is surely 
nothing mysterious or incomprehensible about this. The advantages of 
the function notation are so great that we shall often (but not always) 
use it in preference to the n-tuple notation. The reader will find it 
profitable to keep in mind all three aspects of the elements of R^n (as 
points, as vectors, and as functions), and he will train himself to use that 
interpretation (and notation) which appears most natural in any given 
situation. 
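
The agreement between the n-tuple view and the function view can be made concrete. In the following sketch a function on {1, 2, ..., n} is represented as a dictionary; the helper names and the choice n = 4 are assumptions made for illustration.

```python
n = 4  # illustrative dimension

def as_function(x):
    """View the n-tuple x as the function i -> x_i on {1, ..., n}."""
    return {i: x[i - 1] for i in range(1, n + 1)}

def as_tuple(f):
    """Recover the n-tuple (f(1), ..., f(n)) from the function f."""
    return tuple(f[i] for i in range(1, n + 1))

def add_tuples(x, y):
    """Coordinatewise addition of n-tuples."""
    return tuple(a + b for a, b in zip(x, y))

def add_functions(f, g):
    """Pointwise addition of functions on {1, ..., n}."""
    return {i: f[i] + g[i] for i in f}

def scale_function(alpha, f):
    """Pointwise scalar multiplication."""
    return {i: alpha * f[i] for i in f}

x = (1.0, -2.0, 0.5, 4.0)
y = (3.0, 2.0, -0.5, 1.0)

print(as_tuple(add_functions(as_function(x), as_function(y))) == add_tuples(x, y))
print(as_tuple(scale_function(2.0, as_function(x))))
```

Translating to functions, operating pointwise, and translating back gives the same result as operating coordinatewise, which is all the third interpretation asserts.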

Our next task is to define a suitable norm on the linear space R^n. 
We recall that in solid analytic geometry the usual distance from a point 
(x, y, z) to the origin (see Fig. 21) is given by the expression 


$$\sqrt{x^2 + y^2 + z^2}.$$


If x = (x_1, x_2, ..., x_n) is an arbitrary element of R^n, then it is natural 
to define ||x||, the distance from the point x to the origin, or the length of 
the vector x, by 


$$\|x\| = \sqrt{|x_1|^2 + |x_2|^2 + \cdots + |x_n|^2} = \left(\sum_{i=1}^{n} |x_i|^2\right)^{1/2}.$$


If we think of R^n as composed of real functions f defined on {1, 2, ..., 
n}, then this definition becomes 


$$\|f\| = \left(\sum_{i=1}^{n} |f(i)|^2\right)^{1/2}.$$


This is called the Euclidean norm on R^n, and the real linear space R^n 
normed in this way is called n-dimensional Euclidean space. The 
Euclidean plane is the real linear space R^2 with its Euclidean norm; that is, 
it is the coordinate plane equipped with the above algebraic operations 
and the above norm. For reasons which will appear a little later, we 
observe that our formula defining ||x|| can be applied equally well to 
n-tuples of complex numbers. 

We have not yet proved, of course, that the above expression for 
||x|| possesses the three properties required by the definition of a norm. 
The first and third of these conditions are clearly satisfied. The second, 
namely, that 


||x + y|| ≤ ||x|| + ||y||, 


is another matter. We prove this by the following two lemmas, of which 
the first is essentially a tool used in the proof of the second. 


Lemma (Cauchy's Inequality). Let x = (x_1, x_2, ..., x_n) and y = (y_1, 
y_2, ..., y_n) be two n-tuples of real or complex numbers. Then 


$$\sum_{i=1}^{n} |x_i y_i| \le \left(\sum_{i=1}^{n} |x_i|^2\right)^{1/2} \left(\sum_{i=1}^{n} |y_i|^2\right)^{1/2},$$


or, in our notation, $\sum_{i=1}^{n} |x_i y_i| \le \|x\|\,\|y\|$. 

PROOF. We first remark that if a and b are any two non-negative real 
numbers, then a^{1/2} b^{1/2} ≤ (a + b)/2; for on squaring both sides and 
rearranging, this is equivalent to 0 ≤ (a - b)^2, which is obviously true. 
If x = 0 or y = 0, the assertion of the lemma is clear. We therefore 
assume that x ≠ 0 and y ≠ 0. We define a_i and b_i by a_i = (|x_i|/||x||)^2 
and b_i = (|y_i|/||y||)^2. By the above remark, we obtain the following for 
each i: 


$$\frac{|x_i y_i|}{\|x\|\,\|y\|} \le \frac{|x_i|^2/\|x\|^2 + |y_i|^2/\|y\|^2}{2}.$$


Summing these inequalities as i varies from 1 to n yields 


$$\frac{\sum_{i=1}^{n} |x_i y_i|}{\|x\|\,\|y\|} \le \frac{1 + 1}{2} = 1,$$


from which our conclusion follows at once. 

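
Cauchy's inequality lends itself to a quick numerical sanity check. The sketch below draws random complex n-tuples (the seed, the dimension 5, and the number of trials are arbitrary assumptions) and confirms that the gap ||x|| ||y|| - Σ|x_i y_i| is never negative beyond rounding error; this is a check on examples, not a proof.

```python
import math
import random

def norm(x):
    """Euclidean norm (sum |x_i|^2)^(1/2) of a real or complex tuple."""
    return math.sqrt(sum(abs(c) ** 2 for c in x))

def cauchy_gap(x, y):
    """||x|| ||y|| - sum |x_i y_i|; Cauchy's inequality says this is >= 0."""
    return norm(x) * norm(y) - sum(abs(a * b) for a, b in zip(x, y))

rng = random.Random(0)  # fixed seed so the experiment is repeatable

def random_complex_tuple(n):
    return [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]

trials = [(random_complex_tuple(5), random_complex_tuple(5)) for _ in range(200)]
print(min(cauchy_gap(x, y) for x, y in trials) >= -1e-12)

# The gap vanishes when y = x, since both sides equal ||x||^2.
print(abs(cauchy_gap([3.0, 4.0], [3.0, 4.0])) < 1e-12)
```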
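Lemma (Minkowski's Inequality). Let x = (x_1, x_2, ..., x_n) and 
y = (y_1, y_2, ..., y_n) be two n-tuples of real or complex numbers. Then 


$$\left(\sum_{i=1}^{n} |x_i + y_i|^2\right)^{1/2} \le \left(\sum_{i=1}^{n} |x_i|^2\right)^{1/2} + \left(\sum_{i=1}^{n} |y_i|^2\right)^{1/2},$$


or, in our notation, ||x + y|| ≤ ||x|| + ||y||. 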
PROOF. Using Cauchy's inequality, we have the following chain of 
relations: 


$$\begin{aligned}
\|x + y\|^2 &= \sum_{i=1}^{n} |x_i + y_i|\,|x_i + y_i| \\
&\le \sum_{i=1}^{n} |x_i + y_i|\,(|x_i| + |y_i|) \\
&= \sum_{i=1}^{n} |x_i + y_i|\,|x_i| + \sum_{i=1}^{n} |x_i + y_i|\,|y_i| \\
&\le \|x + y\|\,\|x\| + \|x + y\|\,\|y\| \\
&= \|x + y\|\,(\|x\| + \|y\|),
\end{aligned}$$


or summarizing, 


$$\|x + y\|^2 \le \|x + y\|\,(\|x\| + \|y\|).$$


If ||x + y|| = 0, our lemma is trivially true; otherwise, it follows from 
the inequality last written on dividing through by ||x + y||. 
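
Minkowski's inequality, the triangle inequality for the Euclidean norm, can be checked on examples in the same spirit. The sketch below is self-contained and again relies on arbitrary assumptions (random real tuples, a fixed seed, dimension 6); it illustrates the statement and does not replace the proof.

```python
import math
import random

def norm(x):
    """Euclidean norm (sum |x_i|^2)^(1/2) of a real or complex tuple."""
    return math.sqrt(sum(abs(c) ** 2 for c in x))

def minkowski_gap(x, y):
    """||x|| + ||y|| - ||x + y||; Minkowski's inequality says this is >= 0."""
    return norm(x) + norm(y) - norm([a + b for a, b in zip(x, y)])

rng = random.Random(1)  # fixed seed so the experiment is repeatable
pairs = [([rng.uniform(-5, 5) for _ in range(6)],
          [rng.uniform(-5, 5) for _ in range(6)]) for _ in range(200)]

print(min(minkowski_gap(x, y) for x, y in pairs) >= -1e-12)

# Equality holds when y is a non-negative multiple of x, e.g. y = x.
print(minkowski_gap([3.0, 4.0], [3.0, 4.0]))
```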


We are now in a position to state 


Theorem A. The set R^n of all n-tuples x = (x_1, x_2, ..., x_n) of real 
numbers is a real Banach space with respect to coordinatewise addition and 
scalar multiplication and the norm defined by $\|x\| = \left(\sum_{i=1}^{n} |x_i|^2\right)^{1/2}$. 


PROOF. In view of the above discussions, all that remains is to prove 
completeness. It will be convenient here to use the function notation, so 
that a typical element of our space is regarded as a real function defined 
on {1, 2, ..., n}. Let {f_m} be a Cauchy sequence in R^n. If ε > 0 
is given, then for all sufficiently large m and m' we have ||f_m - f_{m'}|| < ε, 
||f_m - f_{m'}||^2 < ε^2, and $\sum_{i=1}^{n} |f_m(i) - f_{m'}(i)|^2 < \varepsilon^2$; and from this we see 
that |f_m(i) - f_{m'}(i)| < ε for each i (and all sufficiently large m and m'). 
The sequence {f_m} therefore converges pointwise to a limit function f 
defined by f(i) = lim f_m(i). Since the set {1, 2, ..., n} is finite, this 
convergence is uniform. We can thus find a positive integer m_0 such 
that |f_m(i) - f(i)| < ε/n^{1/2} for all m ≥ m_0 and every i. Squaring each 
of these inequalities and summing as i varies from 1 to n yields 
$\sum_{i=1}^{n} |f_m(i) - f(i)|^2 < \varepsilon^2$, or ||f_m - f|| < ε, for all m ≥ m_0. This shows 
that the Cauchy sequence {f_m} converges to the limit f, so R^n is complete. 


Just as in the previous section, virtually every statement we have 
made about n-tuples of real numbers (or about real functions defined on 
{1, 2, ... , n}) has its complex analogue. We therefore consider the 
following theorem to be fully proved. 


Theorem B. The set C^n of all n-tuples z = (z_1, z_2, ..., z_n) of complex 
numbers is a complex Banach space with respect to coordinatewise addition 
and scalar multiplication and the norm defined by $\|z\| = \left(\sum_{i=1}^{n} |z_i|^2\right)^{1/2}$. 

The space C^n, with these algebraic operations and this norm, is 
called n-dimensional unitary space. Needless to say, it can equally well 
be viewed as the set of all complex functions f defined on the set {1, 2, 
..., n}, with addition and scalar multiplication (by complex scalars) 
defined pointwise and the norm defined by $\|f\| = \left(\sum_{i=1}^{n} |f(i)|^2\right)^{1/2}$. 

The four spaces defined and discussed in this and the previous section, 
namely C(X,R) and C(X,C), and R^n and C^n, form the foundation for all 
our future work. In Chaps. 3 to 7 we generalize the first two by loosening 
the restrictions on the underlying space X. In Chaps. 9 to 11 we study 
all four from a wider point of view, with special emphasis on C^n. And 
in the last three chapters we pull these lines of development together 
in such a way that each aspect of our work sheds light on all the others. 


Problems 


1. Show that a non-empty subset A of R^n is bounded ⇔ there exists a 
real number K such that for each x = (x_1, x_2, ..., x_n) in A we 
have |x_i| ≤ K for every subscript i. 

2. Let X be the set {1, 2, ..., n}, equipped with the metric defined 
in Example 9-1. Then C(X,R) and R^n are two Banach spaces which 
are essentially identical as real linear spaces but which have different 
norms. Show that they have the same open sets. 

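3. Prove the following extension of Minkowski's inequality. If 
x = {x_1, x_2, ..., x_n, ...} and y = {y_1, y_2, ..., y_n, ...} are 
two sequences of real or complex numbers such that $\sum_{n=1}^{\infty} |x_n|^2$ and 
$\sum_{n=1}^{\infty} |y_n|^2$ are convergent, then $\sum_{n=1}^{\infty} |x_n + y_n|^2$ is also convergent, and 


$$\left(\sum_{n=1}^{\infty} |x_n + y_n|^2\right)^{1/2} \le \left(\sum_{n=1}^{\infty} |x_n|^2\right)^{1/2} + \left(\sum_{n=1}^{\infty} |y_n|^2\right)^{1/2}.$$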


This statement is also called Minkowski’s inequality—for infinite 
sums. 

4. The set of all sequences x = {x_1, x_2, ..., x_n, ...} of real 
numbers such that $\sum_{n=1}^{\infty} |x_n|^2$ converges is denoted by R^∞. If addition 
and scalar multiplication are defined coordinatewise (or termwise), 
and if a norm is defined by $\|x\| = \left(\sum_{n=1}^{\infty} |x_n|^2\right)^{1/2}$, show that R^∞ is a 
real Banach space. R^∞ is called infinite-dimensional Euclidean 
space. The infinite-dimensional unitary space C^∞ is defined similarly, 
and is a complex Banach space. 


CHAPTER THREE 


Topological Spaces 


In the previous chapter we defined the concept of a continuous 
mapping of one metric space into another, and this definition was for- 
mulated in terms of the metrics on the spaces involved. It often hap- 
pens, however, that it is convenient—even essential—to be able to 
speak of continuous mappings in situations where no useful metrics are 
defined, readily definable, or capable of being defined. In order to deal 
effectively with circumstances of this kind, it is necessary for us to liberate 
our concept of continuity from its dependence on metric spaces. 

Theorem 13-C shows that the continuity of a mapping of one metric 
space into another can be expressed solely in terms of open sets, without 
any direct reference to metrics. This suggests the possibility of dis- 
carding metrics altogether and of replacing them as the source of our 
theory by open sets. With this in mind, our attention is drawn to 
Theorem 10-D, which gives the main internal properties of the class of 
open sets in a metric space. These two theorems provide the leading 
hint on which we base our generalization of metric spaces to topological 
spaces, a topological space being simply a non-empty set in which there 
is given a class of subsets, called open sets, with the properties expressed 
in Theorem 10-D. 

Our underlying purpose in this and the next four chapters is to study 
topological spaces and continuous mappings of topological spaces into 
one another. We shall see that these spaces provide the ideal context 
for a theory of continuity in its purest form. 

This chapter is devoted primarily to explaining the concept of a 
general topological space. We also construct some machinery which will 
be useful in the detailed study of these spaces. 

Our main special interest in the four chapters that follow will be in 
continuous real or complex functions defined on particular types of 
topological spaces, and we shall develop the point of view that there is a 
constant illuminating interplay between the structure of these spaces 
and the properties of the continuous functions which they carry. 


16. THE DEFINITION AND SOME EXAMPLES 


Let X be a non-empty set. A class T of subsets of X is called a 
topology on X if it satisfies the following two conditions: 

(1) the union of every class of sets in T is a set in T; 

(2) the intersection of every finite class of sets in T is a set in T. 

A topology on X is thus a class of subsets of X which is closed under the 
formation of arbitrary unions and finite intersections. A topological space 
consists of two objects: a non-empty set X and a topology T on X. The 
sets in the class T are called the open sets of the topological space (X,T), 
and the elements of X are called its points. It is customary to denote 
the topological space (X,T) by the symbol X which is used for its under- 
lying set of points. No harm can come from this practice if one clearly 
understands that a topological space is more than merely a non-empty 
set: it is a non-empty set together with a specific topology on that set. 
We shall often be considering several topologies on a single given set, 
and in these circumstances distinct topologies make the set into distinct 
topological spaces. We observe that the empty set and the full space 
are always open sets in every topological space, since they are the union 
and intersection of the empty class of sets, which is a subclass of every 
topology. 

We now list several simple examples of topological spaces. In order 
to exhibit a topological space, one must specify a non-empty set, tell 
which subsets are to be considered the open sets, and verify that this 
given class of sets satisfies conditions (1) and (2) above. In the exam- 
ples which follow, we leave this third step to the reader. 


Example 1. Let X be any metric space, and let the topology be the class 
of all subsets of X which are open in the sense of the definition in Sec. 10. 
This is called the usual topology on a metric space, and we say that these 
sets are the open sets generated by the metric on the space. Metric 
spaces are the most important topological spaces, and whenever we 
speak of a metric space as a topological space, it is understood (unless 
we say something to the contrary) that its topology is the usual topology 
described here. 


Example 2. Let X be any non-empty set, and let the topology be the 
class of all subsets of X. This is called the discrete topology on X, and 
any topological space whose topology is the discrete topology is called a 
discrete space. 


Example 3. Let X be any non-empty set, and let the topology consist 
only of the empty set ∅ and the full space X. This topology is at the 
opposite extreme from that described in Example 2, but they coincide 
when X is a set with only one element. 


Example 4. Let X be any infinite set, and let the topology consist of 
the empty set ∅ together with all subsets of X whose complements are 
finite. 


Example 5. Let X be the three-element set {a, b, c}, and let the topology 
consist of the following subsets of X: ∅, {a}, {a,b}, {a,c}, X. Spaces of 
this type serve mainly to illustrate certain aspects of the theory which 
will emerge in later chapters. 
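
For a finite set, the verification left to the reader can be carried out mechanically. The following Python sketch is an illustration added to the text, not part of the original development; the helper names `powerset` and `is_topology` are assumptions of the sketch. It checks conditions (1) and (2) for Examples 2, 3, and 5 on a three-element set, and for one class of sets that fails them.

```python
from itertools import combinations

def powerset(X):
    """All subsets of the finite set X, as frozensets."""
    items = list(X)
    return {frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)}

def is_topology(X, T):
    """Check conditions (1) and (2) for a class T of subsets of a finite set X.

    For a finite class it is enough to test the empty set, the full space,
    and pairwise unions and intersections; larger finite unions and
    intersections then follow by induction.
    """
    X = frozenset(X)
    T = {frozenset(G) for G in T}
    if frozenset() not in T or X not in T:
        return False
    if any(not G <= X for G in T):
        return False
    return all((G | H) in T and (G & H) in T for G in T for H in T)

X = {"a", "b", "c"}
discrete = powerset(X)                                 # Example 2
indiscrete = {frozenset(), frozenset(X)}               # Example 3
example5 = [set(), {"a"}, {"a", "b"}, {"a", "c"}, X]   # Example 5
not_a_topology = [set(), {"a"}, {"b"}, X]              # the union {a, b} is missing

print(is_topology(X, discrete), is_topology(X, indiscrete),
      is_topology(X, example5), is_topology(X, not_a_topology))
```

The first three classes pass the check and the last one fails it, since the union of {a} and {b} is not among its sets.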


A metrizable space is a topological space X with the property that 
there exists at least one metric on the set X whose class of generated 
open sets is precisely the given topology. A metrizable space is thus a 
topological space which is—so far as its open sets are concerned—essen- 
tially a metric space. We shall encounter many important topological 
spaces which are not metrizable, and it is the existence of such spaces 
which gives our present theory a wider scope than the theory of metric 
spaces. It is a problem of considerable interest to determine what 
types of topological spaces are metrizable, and we shall return to this 
question in Sec. 29. 

Let X be a topological space, and let Y be a non-empty subset of X. 
Problem 10-7 suggests a natural way of making Y into a topological space. 
The relative topology on Y is defined to be the class of all intersections 
with Y of open sets in X; and when Y is equipped with its relative 
topology, it is called a subspace of X. 
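
As a small illustration (a sketch added here, with the function name `relative_topology` chosen only for this example), the relative topology on a subset of a finite space can be computed directly from the definition by intersecting each open set with Y.

```python
def relative_topology(T, Y):
    """Relative topology on Y: all intersections with Y of open sets in T."""
    Y = frozenset(Y)
    return {frozenset(G) & Y for G in T}

# Example 5 from the text, restricted to the subset Y = {b, c}.
T = [set(), {"a"}, {"a", "b"}, {"a", "c"}, {"a", "b", "c"}]
Y = {"b", "c"}
TY = relative_topology(T, Y)
print(sorted(sorted(G) for G in TY))
```

In this instance every subset of Y turns out to be open in the relative topology, so the subspace {b, c} of Example 5 is a discrete space.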

Let X and Y be topological spaces and f a mapping of X into Y. 
f is called a continuous mapping if $f^{-1}(G)$ is open in X whenever G is open 
in Y, and an open mapping if f(G) is open in Y whenever G is open in X. 
A mapping is continuous if it pulls open sets back to open sets, and open 
if it carries open sets over to open sets. Any image f(X) of a topological 
space X under a continuous mapping f is called a continuous image of X. 
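
For finite spaces both definitions can be tested by brute force. The sketch below is an added illustration under stated assumptions: mappings are represented as Python dictionaries, the two-point target space and the helper names are hypothetical, and the domain carries the topology of Example 5.

```python
def preimage(f, G):
    """Inverse image of G under f, where f is a dict from points to points."""
    return frozenset(x for x in f if f[x] in G)

def image(f, G):
    """Image of the set G under f."""
    return frozenset(f[x] for x in G)

def is_continuous(f, TX, TY):
    """f pulls open sets in Y back to open sets in X."""
    TX = {frozenset(G) for G in TX}
    return all(preimage(f, G) in TX for G in TY)

def is_open_mapping(f, TX, TY):
    """f carries open sets in X over to open sets in Y."""
    TY = {frozenset(G) for G in TY}
    return all(image(f, G) in TY for G in TX)

TX = [set(), {"a"}, {"a", "b"}, {"a", "c"}, {"a", "b", "c"}]  # Example 5
TY = [set(), {1}, {0, 1}]            # hypothetical two-point space
f = {"a": 1, "b": 0, "c": 0}
constant = {"a": 0, "b": 0, "c": 0}

print(is_continuous(f, TX, TY), is_open_mapping(f, TX, TY))
print(is_continuous(constant, TX, TY), is_open_mapping(constant, TX, TY))
```

Here f is both continuous and open, while the constant mapping is continuous but not open, because the image {0} of the open set {a} is not open in the target space.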

A homeomorphism is a one-to-one continuous mapping of one topological 
space onto another which is also an open mapping. Two topological 
spaces X and Y are said to be homeomorphic if there exists a 
homeomorphism of X onto Y (and in this case, Y is called a homeomorphic image 
of X). If X and Y are homeomorphic, then their points can be put into 
one-to-one correspondence in such a way that their open sets also corre- 
spond to one another. The two spaces therefore differ only in the nature 
of their points, and can, from the point of view of topology, be considered 
essentially identical. 
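
The three requirements in the definition of a homeomorphism can also be checked mechanically on finite spaces. The following sketch is an added illustration; the function name `is_homeomorphism`, the dictionary representation of mappings, and the two small spaces are assumptions made for the example.

```python
def preimage(f, G):
    return frozenset(x for x in f if f[x] in G)

def image(f, G):
    return frozenset(f[x] for x in G)

def is_homeomorphism(f, X, TX, Y, TY):
    """One-to-one and onto, continuous, and open, as in the definition."""
    TX = {frozenset(G) for G in TX}
    TY = {frozenset(G) for G in TY}
    values = set(f.values())
    one_to_one_onto = len(values) == len(X) and values == set(Y)
    continuous = all(preimage(f, G) in TX for G in TY)
    open_map = all(image(f, G) in TY for G in TX)
    return one_to_one_onto and continuous and open_map

X = {0, 1}
discrete = [set(), {0}, {1}, {0, 1}]
indiscrete = [set(), {0, 1}]
identity = {0: 0, 1: 1}

Y = {"p", "q"}
TX = [set(), {0}, {0, 1}]
TY = [set(), {"p"}, {"p", "q"}]
relabel = {0: "p", 1: "q"}

print(is_homeomorphism(identity, X, discrete, X, indiscrete))
print(is_homeomorphism(relabel, X, TX, Y, TY))
```

The identity mapping from the discrete two-point space to the same set with the topology of Example 3 is one-to-one, onto, and continuous, yet it is not open, so it fails the test; the relabelling mapping passes it because open sets correspond exactly.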

We have just used the word topology in its primary sense, as the name 
of a branch of mathematics. This word derives from two Greek words, 
and its literal meaning is “the science of position.” How can we account 
for this? Let us say that a topological property is a property which, if 
possessed by a topological space X, is also possessed by every homeo- 
morphic image of X. The subject of topology can now be defined as the 
study of all topological properties of topological spaces. If, very roughly, 
we think of a topological space as a general type of geometric configuration, 
say, a diagram drawn on a sheet of rubber, then a homeomorphism 
may be thought of as any deformation of this diagram (by stretching, 
bending, etc.) which does not tear the sheet. A circle can be deformed 
in this way into an ellipse, a triangle, or a square, but not into a figure 
eight, a horseshoe, or a single point. A topological property would then 
be any property of the diagram which is invariant under (or unchanged 
by) such a deformation. Distances, angles, and the like, are not topological 
properties, because they can be altered by suitable “non-tearing” 
deformations. What sorts of properties are topological? In the case of 
the circle, the fact that it has one “inside” and one “outside” (a point 
has no inside, and a figure eight has two). Also, the fact that when two 
points are removed from a circle it falls into two pieces, whereas if only 
one point is removed, then the circle remains in one piece. These 
remarks may suffice to indicate why topology is often described to 
non-mathematicians as “rubber sheet geometry.” For an excellent 
nontechnical discussion of topology from this geometric point of view, see 
Courant and Robbins [6, chap. 5]. 


Problems 


1. Let $T_1$ and $T_2$ be two topologies on a non-empty set X, and show 
that $T_1 \cap T_2$ is also a topology on X. 

2. Let X be a non-empty set, and consider the class of subsets of X 
consisting of the empty set @ and all sets whose complements are 
countable. Is this a topology on X? 

3. Which topological spaces given as examples in the text are metriza- 
ble? (Hint: if a space is metrizable, then its open sets must have 
certain properties.) 

4. Show that if a topological space is metrizable, then it is metrizable 
in an infinite number of different ways (i.e., by means of an infinite 
number of different metrics). 

5. Show that a subspace of a topological space is itself a topological 
space. 

6. Let X be a topological space, and let Y and Z be subspaces of X such 
that $Y \subseteq Z$. Show that the topology which Y has as a subspace of 
X is the same as that which it has as a subspace of Z. 

7. Let f be a continuous mapping of a topological space X into a 
topological space Y. If Z is a subspace of X, show that the restric- 
tion of f to Z is continuous. 

8. Let X and Y be topological spaces, and f a mapping of X into Y. 
Show that f is continuous ⇔ it is continuous as a mapping of X onto 
the subspace f(X) of Y. 

9. Let X, Y, and Z be topological spaces. If $f: X \to Y$ and $g: Y \to Z$ 
are continuous mappings, show that $gf: X \to Z$ is also continuous. 

10. Let f be a one-to-one mapping of one topological space onto another, 
and show that f is a homeomorphism ⇔ both f and $f^{-1}$ are continuous. 

11. Give an example to show that a one-to-one continuous mapping of 
one topological space onto another need not be a homeomorphism. 
(Hint: consider Examples 2 and 3.) 

12. Show that a topological space X is metrizable ⇔ there exists a 
homeomorphism of X onto a subspace of some metric space Y. 

13. If X and Y are topological spaces, let X ~ Y mean that X and Y are 
homeomorphic. Show that this relation is reflexive, symmetric, 
and transitive. 


17. ELEMENTARY CONCEPTS 


We have taken open sets as the starting point in our development of 
topology, and we now define a number of other basic concepts in terms of 
open sets. Most of these will be familiar to the reader from the previous 
chapter, and he will observe that in every case the definition given here is 
a strict generalization of our earlier definition or some equivalent form 
of it. 

A closed set in a topological space is a set whose complement is open. 
The following theorem is an immediate consequence of Eqs. 2-(2) and the 
assumed properties of open sets. 


Theorem A. Let X be a topological space. Then (1) any intersection of 
closed sets in X is closed; and (2) any finite union of closed sets in X is closed. 


By considering the empty class of closed sets, we see at once that the 
empty set and the full space (its union and intersection) are always 
closed sets in every topological space. 

If A is a subset of a topological space, then its closure (denoted by $\overline{A}$) 
is the intersection of all closed supersets of A. It is easy to see that the 
closure of A is a closed superset of A which is contained in every closed 
superset of A, and that A is closed ⇔ $A = \overline{A}$. A subset A of a topological 
space X is said to be dense (or everywhere dense) if $\overline{A} = X$, and X is 
called a separable space if it has a countable dense subset. For reasons 
which will become clear at the end of this section, we summarize the main 
facts about the operation of forming closures in the following theorem. 
Its proof is a direct application of the above statements. 


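Theorem B. Let X be a topological space. If A and B are arbitrary 
subsets of X, then the operation of forming closures has the following four 
properties: (1) $\overline{\emptyset} = \emptyset$; (2) $A \subseteq \overline{A}$; (3) $\overline{\overline{A}} = \overline{A}$; and (4) $\overline{A \cup B} = \overline{A} \cup \overline{B}$. 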
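
On a finite space the closure can be computed straight from its definition as the intersection of all closed supersets, and the four properties of Theorem B can then be verified exhaustively. The sketch below is an added illustration using the space of Example 16-5; the helper names are assumptions of the sketch.

```python
from itertools import combinations

def powerset(X):
    """All subsets of the finite set X, as frozensets."""
    items = list(X)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def closure(A, X, T):
    """Intersection of all closed supersets of A in the space (X, T)."""
    X, A = frozenset(X), frozenset(A)
    result = X
    for G in T:
        F = X - frozenset(G)      # F is closed because its complement G is open
        if A <= F:
            result &= F
    return result

X = {"a", "b", "c"}
T = [set(), {"a"}, {"a", "b"}, {"a", "c"}, X]   # Example 16-5

def cl(A):
    return closure(A, X, T)

subsets = powerset(X)
kuratowski_holds = (
    cl(set()) == frozenset()                                        # property (1)
    and all(A <= cl(A) for A in subsets)                            # property (2)
    and all(cl(cl(A)) == cl(A) for A in subsets)                    # property (3)
    and all(cl(A | B) == cl(A) | cl(B) for A in subsets for B in subsets)  # (4)
)
print(kuratowski_holds, sorted(cl({"a"})), sorted(cl({"b"})))
```

In this space the closure of {a} is the whole space, so {a} is dense, while {b} is already closed.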


A neighborhood of a point (or a set) in a topological space is an open 
set which contains the point (or the set). A class of neighborhoods of a 
point is called an open base for the point (or an open base at the point) if 
each neighborhood of the point contains a neighborhood in this class. 
In the case of a point in a metric space, an open sphere centered on the 
point is a neighborhood of the point, and the class of all such open 
spheres is an open base for the point. Our next theorem gives a useful 
characterization (in terms of neighborhoods) of the closure of a set. 


Theorem C. Let X be a topological space and A an arbitrary subset of X. 
Then $\overline{A} = \{x : \text{each neighborhood of } x \text{ intersects } A\}$. 

PROOF. We begin by proving that $\overline{A}$ is contained in the given set (the 
set on the right) by showing that any point not in the given set is not in $\overline{A}$. 
Let x be a point with a neighborhood which does not intersect A. Then 
the complement of this neighborhood is a closed superset of A which 
does not contain x, and since $\overline{A}$ is the intersection of all closed supersets of 
A, x is not in $\overline{A}$. In the same way, it can easily be shown that $\overline{A}$ contains 
the given set. 
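
Theorem C can be confirmed on a small space by computing the closure in both ways and comparing the results. This is an added sketch, not part of the original text; the function names and the choice of the space of Example 16-5 are assumptions of the illustration.

```python
from itertools import combinations

def powerset(X):
    items = list(X)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def closure(A, X, T):
    """Closure as the intersection of all closed supersets of A."""
    X, A = frozenset(X), frozenset(A)
    result = X
    for G in T:
        F = X - frozenset(G)
        if A <= F:
            result &= F
    return result

def closure_by_neighborhoods(A, X, T):
    """All points x such that each neighborhood of x intersects A."""
    A = frozenset(A)
    return frozenset(
        x for x in X
        if all(frozenset(G) & A for G in T if x in G)
    )

X = {"a", "b", "c"}
T = [set(), {"a"}, {"a", "b"}, {"a", "c"}, X]   # Example 16-5

agree = all(closure(A, X, T) == closure_by_neighborhoods(A, X, T)
            for A in powerset(X))
print(agree)
```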


Let X be a topological space and A a subset of X. A point in A is 
called an isolated point of A if it has a neighborhood which contains no 
other point of A. A point x in X is said to be a limit point of A if each of its 
neighborhoods contains a point of A different from x. The derived set 
of A, denoted by D(A), is the set of all limit points of A. 


Theorem D. Let X be a topological space and A a subset of X. Then 
(1) $\overline{A} = A \cup D(A)$; and (2) A is closed ⇔ $A \supseteq D(A)$. 

PROOF. To prove (1), we use Theorem C to show that any point not 
in one side is also not in the other. If x is not in $\overline{A}$, then it has a neighborhood 
disjoint from A, so it is not in A or D(A); and if x is not in A or 
D(A), then it has a neighborhood disjoint from A, so it is not in $\overline{A}$. 

We prove (2) as follows. If A is closed, so that $A = \overline{A}$, then by 
(1) $A = A \cup D(A)$, from which we see that $A \supseteq D(A)$; and if $A \supseteq D(A)$, 
so that $A \cup D(A) = A$, then by (1) we have $\overline{A} = A$, so A is closed. 
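
Both parts of Theorem D can be checked exhaustively on a finite space. The sketch below is an added illustration; `derived_set` and the other helper names are assumptions, and the space is again that of Example 16-5.

```python
from itertools import combinations

def powerset(X):
    items = list(X)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def closure(A, X, T):
    X, A = frozenset(X), frozenset(A)
    result = X
    for G in T:
        F = X - frozenset(G)
        if A <= F:
            result &= F
    return result

def derived_set(A, X, T):
    """Limit points of A: each neighborhood contains a point of A other than x."""
    A = frozenset(A)
    return frozenset(
        x for x in X
        if all((frozenset(G) & A) - {x} for G in T if x in G)
    )

X = frozenset({"a", "b", "c"})
T = {frozenset(G) for G in [set(), {"a"}, {"a", "b"}, {"a", "c"}, X]}

part1 = all(closure(A, X, T) == A | derived_set(A, X, T) for A in powerset(X))
part2 = all(((X - A) in T) == (derived_set(A, X, T) <= A) for A in powerset(X))
print(part1, part2, sorted(derived_set({"a"}, X, T)))
```

For the set {a} the limit points are b and c, so its closure is the whole space, in agreement with part (1).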


By the above definitions, a point in a set is either an isolated point 
of the set or a limit point of the set, but not both. This fact leads to the 
following obvious but rather satisfying theorem. 


Theorem E. Let X be a topological space. Then any closed subset of X 
is the disjoint union of its set of isolated points and its set of limit points, in 
the sense that it contains these sets, they are disjoint, and it is their union. 


Let X be a topological space and A a subset of X. The interior of A 
[denoted by Int(A)] is the union of all open subsets of A, and a point in 
the interior of A is called an interior point of A. It is clear that the 
interior of A is an open subset of A which contains every open subset 
of A, and that A is open ⇔ A = Int(A). Also, a point in A is an 
interior point of A ⇔ it has a neighborhood which is contained in A. 
The boundary of A is $\overline{A} \cap \overline{A'}$, and a point in the boundary of A is called 
a boundary point of A. It follows at once from the definition that the 
boundary of A is a closed set, and that it consists of all points x in X with 
the property that each neighborhood of x intersects both A and A'. 

It is easy to see from the neighborhood characterizations of the 
interior and boundary that a point in a set is an interior point of the set 
or a boundary point of the set, but not both. This immediately yields 
the following theorem, which serves to validate our feeling about the 
intuitive significance of interiors and boundaries. 


Theorem F. Let X be a topological space. Then any closed subset of X 
is the disjoint union of its interior and its boundary, in the sense that it 
contains these sets, they are disjoint, and it is their union. 
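
The decomposition in Theorem F can be observed directly on a finite space. The following sketch is an added illustration under the assumption that the space is the one in Example 16-5; the function names are chosen for this example only.

```python
def closure(A, X, T):
    X, A = frozenset(X), frozenset(A)
    result = X
    for G in T:
        F = X - frozenset(G)
        if A <= F:
            result &= F
    return result

def interior(A, T):
    """Union of all open subsets of A."""
    A = frozenset(A)
    result = frozenset()
    for G in T:
        if frozenset(G) <= A:
            result |= frozenset(G)
    return result

def boundary(A, X, T):
    """Intersection of the closure of A with the closure of its complement."""
    X, A = frozenset(X), frozenset(A)
    return closure(A, X, T) & closure(X - A, X, T)

X = frozenset({"a", "b", "c"})
T = [frozenset(G) for G in [set(), {"a"}, {"a", "b"}, {"a", "c"}, X]]
closed_sets = [X - G for G in T]

theorem_f_holds = all(
    not (interior(F, T) & boundary(F, X, T))          # the two parts are disjoint
    and interior(F, T) | boundary(F, X, T) == F       # and their union is F
    for F in closed_sets
)
print(theorem_f_holds, sorted(interior({"b", "c"}, T)), sorted(boundary({"b", "c"}, X, T)))
```

The closed set {b, c} has empty interior in this space, so it coincides with its own boundary.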


In defining a topological space, we chose “open set” as our primitive 
undefined term. Our next theorem shows that “closed set” would have 
served just as well. 


Theorem G. Let X be a non-empty set, and let there be given a class of sub- 
sets of X which is closed under the formation of arbitrary intersections and 
finite unions. Then the class of all complements of these sets is a topology 
on X whose closed sets are precisely those initially given. 

PROOF. This follows immediately from Eqs. 2-(2), the definition of a 
topology, and the definition of a closed set. 


As the following theorem shows, we could even have taken the term 
“closure” as our undefined concept. 


Theorem H. Let X be a non-empty set, and let there be given a “closure” 
operation which assigns to each subset A of X a subset $\overline{A}$ of X in such a 
manner that (1) $\overline{\emptyset} = \emptyset$, (2) $A \subseteq \overline{A}$, (3) $\overline{\overline{A}} = \overline{A}$, and (4) $\overline{A \cup B} = \overline{A} \cup \overline{B}$. 
If a “closed” set A is defined to be one for which $A = \overline{A}$, then the class 
of all complements of such sets is a topology on X whose closure operation is 
precisely that initially given. 

PROOF. In view of Theorem G, it suffices to demonstrate two facts: 
that the class of all “closed” sets is closed under the formation of arbitrary 
intersections and finite unions; and that for any set A, $\overline{A}$ equals the 
intersection of all “closed” supersets of A. 

By (1), the empty set is “closed,” and from this and (4) we see that 
any finite union of “closed” sets is “closed.” By (2), the full space X is 
“closed,” so all that remains in the first part of our proof is to show that 
if $\{A_i\}$ is a non-empty class of sets such that $\overline{A_i} = A_i$ for every i, then 
$\overline{\bigcap_i A_i} = \bigcap_i A_i$. By (2), it suffices to prove that $\overline{\bigcap_i A_i} \subseteq \bigcap_i A_i$. For this, 
it suffices to show that $A \subseteq B \Rightarrow \overline{A} \subseteq \overline{B}$ (since $\bigcap_i A_i \subseteq A_i$ for each i, it 
will follow that $\overline{\bigcap_i A_i} \subseteq \overline{A_i} = A_i$ for each i, from which we see that 
$\overline{\bigcap_i A_i} \subseteq \bigcap_i A_i$). Assume that $A \subseteq B$. Then $B = A \cup B$, and by (4), 
$\overline{B} = \overline{A \cup B} = \overline{A} \cup \overline{B}$, so $\overline{A} \subseteq \overline{B}$. 

We now let A be an arbitrary subset of X, and we show that $\overline{A}$ equals 
the intersection of all “closed” supersets of A. By (2) and (3), $\overline{A}$ is a 
“closed” superset of A, so it suffices to show that if $A \subseteq B$ and $B = \overline{B}$, 
then $\overline{A} \subseteq B$. Since $A \subseteq B$, $B = A \cup B$. By (4) and our assumption 
that $B = \overline{B}$, we obtain $B = \overline{B} = \overline{A \cup B} = \overline{A} \cup \overline{B} = \overline{A} \cup B$, so $\overline{A} \subseteq B$. 
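
The construction in Theorem H can be traced on a three-element set. In the added sketch below, the closure operation `cl` is a hypothetical example chosen for the illustration (any set containing the point a is sent to the whole space, and every other set is left unchanged); it satisfies the four assumed properties, and the topology it produces is the one in Example 16-5.

```python
from itertools import combinations

def powerset(X):
    items = list(X)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def topology_from_closure(X, cl):
    """Complements of the sets A with cl(A) == A, as in Theorem H."""
    X = frozenset(X)
    closed_sets = [A for A in powerset(X) if cl(A) == A]
    return {X - F for F in closed_sets}

def closure(A, X, T):
    """Closure in the space (X, T): intersection of all closed supersets."""
    X, A = frozenset(X), frozenset(A)
    result = X
    for G in T:
        F = X - frozenset(G)
        if A <= F:
            result &= F
    return result

X = frozenset({"a", "b", "c"})

def cl(A):
    """Hypothetical closure operation used only for this illustration."""
    A = frozenset(A)
    return X if "a" in A else A

T = topology_from_closure(X, cl)
recovered = all(closure(A, X, T) == cl(A) for A in powerset(X))
print(sorted(sorted(G) for G in T), recovered)
```

The closure operation of the resulting topology agrees with `cl` on every subset, which is the final assertion of the theorem in this small case.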


The four properties of the closure operation assumed in this theorem 
are called the Kuratowski closure axioms. The last two theorems show 
that it is possible to approach the subject of topological spaces by taking 
either closed sets or a closure operation as the basic undefined concept. 
A good deal of research was done along these lines in the early days of 
topology. It was found that there are many different ways of defining a 
topological space, all of which are equivalent to one another. Several 
decades of experience have convinced most mathematicians that the 
open set approach is the simplest, the smoothest, and the most natural. 


Problems 


1. Let $f: X \to Y$ be a mapping of one topological space into another. 
Show that f is continuous ⇔ $f^{-1}(F)$ is closed in X whenever F is 
closed in Y ⇔ $f(\overline{A}) \subseteq \overline{f(A)}$ for every subset A of X. 

2. Let X be a topological space, Y a metric space, and A a subspace 
of X. If f is a continuous mapping of A into Y, show that f can be 
extended in at most one way to a continuous mapping of $\overline{A}$ into Y. 
(Hint: see Problem 13-3.) 

3. Show that a subset of a topological space is dense ⇔ it intersects 
every non-empty open set. 

4. Let A be a non-empty subset of a topological space, and show that 
A is dense as a subset of the subspace $\overline{A}$. 

5. A subset A of a topological space is called a perfect set if 
A = D(A). Show that a set is perfect ⇔ it is closed and has no 
isolated points. Show that the Cantor set is perfect. 

6. Show that $\mathrm{Int}(A') = (\overline{A})'$ for every subset A of a topological space. 

7. Show that a subset of a topological space is closed ⇔ it contains its 
boundary. 

8. Show that a subset of a topological space has empty boundary ⇔ it is 
both open and closed. (Every topological space X has the property 
that the empty set ∅ and the full space X are both open and closed. 
In Chap. 6 we study the hypothesis that these are the only subsets of 
X which are both open and closed.) 

9. A subset A of a topological space is said to be nowhere dense if $\overline{A}$ has 
empty interior. 

(a) Show that a set A is nowhere dense ⇔ every non-empty open set 
has a non-empty open subset disjoint from A. 

(b) Show that a closed set is nowhere dense ⇔ its complement is 
everywhere dense. Is this true for an arbitrary set? 

(c) Show that the boundary of a closed set is nowhere dense. Is 
this true for an arbitrary set? 


18. OPEN BASES AND OPEN SUBBASES 


A special role is played in the theory of metric spaces by the class of 
open spheres within the class of all open sets. The main feature of their 
relationship is that the open sets coincide with all unions of open spheres, 
and it follows from this that the continuity of a mapping can be expressed 
either in terms of open spheres or in terms of open sets, at our convenience. 
We now develop similar machinery for topological spaces. 

Let X be a topological space. An open base for X is a class of open 
sets with the property that every open set is a union of sets in this class. 
This condition can also be expressed in the following equivalent form: if 
G is an arbitrary non-empty open set and x is a point in G, then there 
exists a set B in the open base such that $x \in B \subseteq G$. The sets in an open 
base are referred to as basic open sets. It is clear that the class of open 
spheres in a metric space is an open base, and also that any class of open 
sets which contains an open base is itself an open base. 

Generally speaking, an open base is useful only if its sets are simple 
in form or few in number. For instance, a space which has a countable 
open base has many pleasant properties. A space of this kind is said to 
be a second countable space, or to satisfy the second axiom of countability.¹ 
It is easy to see that any subspace of a second countable space is also 
second countable, for the class of all intersections with the subspace of 
sets in an open base is evidently an open base for the subspace. The 
central fact about second countable spaces can be stated as follows. 


Theorem A (Lindelöf's Theorem). Let X be a second countable space. 
If a non-empty open set G in X is represented as the union of a class $\{G_i\}$ of 
open sets, then G can be represented as a countable union of $G_i$'s. 

PROOF. Let $\{B_n\}$ be a countable open base for X. Let x be a point in G. 
The point x is in some $G_i$, and we can find a basic open set $B_n$ such that 
$x \in B_n \subseteq G_i$. If we do this for each point x in G, we obtain a subclass of 
our countable open base whose union is G, and this subclass is necessarily 
countable. Further, for each basic open set in this subclass we can 
select a $G_i$ which contains it. The class of $G_i$'s which arises in this way is 
clearly countable, and its union is G. 


Most applications of Lindelöf's theorem depend more directly on 
the following simple consequence of it. 


Theorem B. Let X be a second countable space. Then any open base for 
X has a countable subclass which is also an open base. 

PROOF. Let $\{B_n\}$ be a countable open base and $\{B_i\}$ an arbitrary open 
base. Since each $B_n$ is a union of $B_i$'s, we see by Lindelöf's theorem that 
each non-empty $B_n$ is the union of a countable class of $B_i$'s. In this 
way we obtain a countable family of countable classes of $B_i$'s. The 
union of this family of classes is evidently an open base which is a countable 
subclass of the open base $\{B_i\}$. 


If a topological space X has a countable open base $\{B_n\}$, then it also 
has a countable dense subset. To see this, we have only to select a 
point in each non-empty $B_n$ and to note that the set of all these points is 
countable and dense in X. Thus every second countable space is 
separable. This simple result admits the following partial converse. 


Theorem C. Every separable metric space is second countable. 
PROOF. Let X be a separable metric space, and let A be a countable 
dense subset. If we consider the open spheres with rational radii 
centered on all the points of A, then the class of all these open spheres is a 
countable class of open sets. We show that it is an open base. Let G be 
an arbitrary non-empty open set and x a point in G. We must find an 
open sphere in our class which contains x and is contained in G. Let 
$S_r(x)$ be an open sphere centered on x and contained in G, and consider the 
concentric open sphere $S_{r/3}(x)$ with one-third its radius. Since A is 
dense, there exists a point a in A which is in $S_{r/3}(x)$. Let $r_1$ be a rational 
number such that $r/3 < r_1 < 2r/3$. We conclude the proof by observing 
that $x \in S_{r_1}(a) \subseteq S_r(x) \subseteq G$. 

¹ A first countable space (or a space which satisfies the first axiom of countability) 
is a topological space which has a countable open base at each of its points (see 
Sec. 17). 
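
The choice of the center a and the radius $r_1$ in the proof of Theorem C can be made concrete on the real line, with the rational numbers as the countable dense subset. The sketch below is an added illustration under stated assumptions: the function name `rational_sphere` is hypothetical, the radius r is taken to be rational, and the point x is a floating-point number, which is converted to the exact fraction it represents.

```python
from fractions import Fraction
from math import ceil, sqrt

def rational_sphere(x, r):
    """Return (a, r1) with a rational, r/3 < r1 < 2r/3, and |x - a| < r/3.

    Then x lies in S_{r1}(a), and S_{r1}(a) is contained in S_r(x),
    because any y with |y - a| < r1 satisfies |y - x| < r1 + r/3 < r.
    """
    x, r = Fraction(x), Fraction(r)
    # A fraction with denominator at most D approximates x to within 1/(2D),
    # and D > 3/r makes this error smaller than r/3.
    D = ceil(3 / r) + 1
    a = x.limit_denominator(D)
    r1 = r / 2
    assert abs(x - a) < r / 3
    assert r / 3 < r1 < 2 * r / 3
    return a, r1

x = sqrt(2)
r = Fraction(1, 10)
a, r1 = rational_sphere(x, r)
print(a, r1)
```

Taking $r_1 = r/2$ is only one convenient choice; any rational number strictly between r/3 and 2r/3 would serve equally well.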


In order to form the simplest intuitive picture of our next concept, 
we give a brief discussion of rectangles and strips in the Euclidean 
plane $R^2$. Figure 22 is intended to illustrate our remarks. If $(a_1,b_1)$ 
and $(a_2,b_2)$ are bounded open intervals (one on the $x_1$ axis and the other 
on the $x_2$ axis), then their product 

$$(a_1,b_1) \times (a_2,b_2) = \{(x_1,x_2) : a_i < x_i < b_i \text{ for } i = 1,2\}$$

is called an open rectangle in $R^2$. A closed rectangle is defined similarly, as 
a product of two closed intervals. 

[Fig. 22. Open strips and an open rectangle.] 

It is easy to prove (see Problem 8) that the class of all open rectangles is 
an open base for the Euclidean plane. We now observe that each open 
rectangle is the intersection of two open strips, in the following sense. 
We call sets of the form 

$$(a_1,b_1) \times R = \{(x_1,x_2) : a_1 < x_1 < b_1,\ x_2 \text{ arbitrary}\}$$
and 
$$R \times (a_2,b_2) = \{(x_1,x_2) : a_2 < x_2 < b_2,\ x_1 \text{ arbitrary}\}$$

open strips in $R^2$. If we use closed intervals here, we get what we call 
closed strips. It is plain that 

$$(a_1,b_1) \times (a_2,b_2) = [(a_1,b_1) \times R] \cap [R \times (a_2,b_2)].$$

Since every open strip in $R^2$ is clearly an open set, the class of all open 
strips is a class of open sets whose finite intersections form an open base, 
namely, the open base composed of the open strips, the open rectangles, 
the empty set, and the full space $R^2$. 

Now let X be a topological space. An open subbase is a class of open 
subsets of X whose finite intersections form an open base. This open 
base is called the open base generated by the open subbase. We refer to 
the sets in an open subbase as subbasic open sets. It is easy to see that 
any class of open sets which contains an open subbase is also an open 
subbase. Since the bounded open intervals on the real line constitute 
an open base for this space, it is clear that all open intervals of the type 
$(a, +\infty)$ and $(-\infty, b)$, where a and b are real numbers, form an open 
subbase. The open base generated by this open subbase consists of all 
open intervals of this kind, all bounded open intervals, the empty set, and 
the full space R. The ideas in the previous paragraph show at once that 
all open strips in the Euclidean plane form an open subbase for this space. 

The practical value of open subbases rests mainly on the following 
theorem. 


Theorem D. Let X be any non-empty set, and let S be an arbitrary class 
of subsets of X. Then S can serve as an open subbase for a topology on X, in 
the sense that the class of all unions of finite intersections of sets in S is a 
topology. 

PROOF. If S is empty, then the class of all finite intersections of its sets 
is the single-element class {X}, and the class of all unions of sets in this 
class is the two-element class {∅, X}. Since this is the topology described 
in Example 16-3, we may assume that S is non-empty. Let B be the 
class of all finite intersections of sets in S, and let T be the class of all 
unions of sets in B. We must show that T is a topology. T clearly 
contains ∅ and X, and is closed under the formation of arbitrary unions. 
All that remains is to show that if $\{G_1, G_2, \ldots, G_n\}$ is a non-empty 
finite class of sets in T, then $G = \bigcap_{i=1}^{n} G_i$ is also in T. Since the empty 
set is in T, we may assume that G is non-empty. Let x be a point in G. 
Then x is in each $G_i$, and by the definition of T, for each i there is a set 
$B_i$ in B such that $x \in B_i \subseteq G_i$. Since each $B_i$ is a finite intersection of sets 
in S, the intersection of all sets in S which arise in this way is a set in B 
which contains x and is contained in G. We conclude the proof by 
noting that this shows that G is a union of sets in B and is thus itself a 
set in T. 


We speak of the topology in this theorem as the topology generated by 
the class S. As we shall see in later chapters, this theorem, though not 
particularly valuable as an end in itself, is quite a useful tool. It is 
normally used in the following manner. If X is a non-empty set, and if 
we have a class of subsets of X which we wish to regard as open sets, all 
we have to do is form the topology generated by this class in the sense of 
Theorem D. 
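
On a finite set the two steps of Theorem D (finite intersections, then arbitrary unions) can be carried out literally. The sketch below is an added illustration; the function name `generated_topology` is an assumption, and the brute-force enumeration is practical only for very small classes of sets.

```python
from itertools import combinations

def generated_topology(X, S):
    """Topology generated by the class S of subsets of the finite set X."""
    X = frozenset(X)
    S = [frozenset(A) for A in S]
    # Step 1: all finite intersections; the empty intersection is X itself.
    base = {X}
    for r in range(1, len(S) + 1):
        for combo in combinations(S, r):
            base.add(frozenset.intersection(*combo))
    # Step 2: all unions of sets in the base; the empty union is the empty set.
    base = list(base)
    topology = {frozenset()}
    for r in range(1, len(base) + 1):
        for combo in combinations(base, r):
            topology.add(frozenset().union(*combo))
    return topology

X = {"a", "b", "c"}
S = [{"a", "b"}, {"a", "c"}]
T = generated_topology(X, S)
print(sorted(sorted(G) for G in T))
print(sorted(sorted(G) for G in generated_topology(X, [])))
```

The class consisting of {a, b} and {a, c} generates the topology of Example 16-5, and the empty class generates the topology of Example 16-3, as noted at the start of the proof.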

Our next result often makes much lighter the task of proving that a 
given specific mapping is either continuous or open. 


Theorem E. Let $f: X \to Y$ be a mapping of one topological space into 
another, and let there be given an open base in X and an open subbase with 
its generated open base in Y. Then (1) f is continuous ⇔ the inverse image 
of each basic open set is open ⇔ the inverse image of each subbasic open set is 
open; and (2) f is open ⇔ the image of each basic open set is open. 

PROOF. These statements are immediate consequences of the definitions 
and, respectively, Eqs. 3-(2) and 3-(3) and Eq. 3-(1). 
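
Part (1) of Theorem E means that a continuity check needs to examine only the subbasic open sets. The added sketch below compares the full check with the subbase check on the space of Example 16-5, for which the two sets {a, b} and {a, c} form an open subbase; the mappings and function names are hypothetical choices for the illustration.

```python
def preimage(f, G):
    """Inverse image of G under f, where f is a dict from points to points."""
    return frozenset(x for x in f if f[x] in G)

def is_continuous(f, TX, TY):
    """Test the inverse image of every open set in Y."""
    TX = {frozenset(G) for G in TX}
    return all(preimage(f, G) in TX for G in TY)

def continuous_on_subbase(f, TX, S):
    """Test only the inverse images of the subbasic open sets in S."""
    TX = {frozenset(G) for G in TX}
    return all(preimage(f, G) in TX for G in S)

T = [set(), {"a"}, {"a", "b"}, {"a", "c"}, {"a", "b", "c"}]  # Example 16-5
S = [{"a", "b"}, {"a", "c"}]                                 # an open subbase for T

swap = {"a": "a", "b": "c", "c": "b"}   # interchanges b and c
g = {"a": "b", "b": "a", "c": "c"}      # interchanges a and b

print(is_continuous(swap, T, T), continuous_on_subbase(swap, T, S))
print(is_continuous(g, T, T), continuous_on_subbase(g, T, S))
```

The two tests agree for both mappings: the first mapping is continuous, and the second is not, since the inverse image of {a, c} is {b, c}, which is not open.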


We put these two theorems to work in the next section, where we 
develop a fragment of lattice theory which is very useful in the applications 
of topology to modern analysis. 


Problems 


1. Let X be a topological space, and B an open base with the property 
that each point in the space is contained in a basic open set different 
from X. Show that if ∅ and X happen to be in B, then the class 
which results when these two sets are dropped from B is still an open 
base. 

2. Under what circumstances is the metric space defined in Example 
9-1 separable? 

3. Show that the real line and the complex plane are separable. 
Show also that $R^n$ and $C^n$ are separable. Show finally that $R^\infty$ and 
$C^\infty$ are separable. 

4. Let X be the metric space whose points are the positive integers and 
whose metric is that defined in Example 9-1, and show that $\mathcal{C}(X,R)$ is 
not separable. (Hint: if $\{f_n\}$ is a sequence in $\mathcal{C}(X,R)$, and if f is 
the function in $\mathcal{C}(X,R)$ defined by $f(n) = 0$ if $|f_n(n)| \geq 1$ and 
$f(n) = |f_n(n)| + 1$ if $|f_n(n)| < 1$, then $\|f - f_n\| \geq 1$ for every n.) 

5. Let X be any non-empty set with the metric defined in Example 9-1, 
and show that $\mathcal{C}(X,R)$ is separable ⇔ X is finite. 

6. The following example demonstrates that a topological space with a 
countable dense subset need not be second countable. Let X be the 
set of all real numbers with the topology described in Example 16-4. 

(a) Show that any infinite subset of X is dense. 

(b) Show that X is not second countable. (Hint: assume that 
there exists a countable open base, let $x_0$ be a fixed point in X, 
show that the intersection of all basic open sets which contain 
$x_0$ is the single-element set $\{x_0\}$, and conclude from this that the 
complement of $\{x_0\}$ is countable.) 

7. Show that the set of all isolated points of a second countable space is 
empty or countable. Show from this that any uncountable subset 
A of a second countable space must have at least one point which is a 
limit point of A. 

8. Prove in detail that the open rectangles in the Euclidean plane form 
an open base. 

9. Let $f: X \to Y$ be a mapping of one topological space into another. 
f is said to be continuous at a point $x_0$ in X if for each neighborhood 
H of $f(x_0)$ there exists a neighborhood G of $x_0$ such that $f(G) \subseteq H$. 

(a) Show that f is continuous ⇔ it is continuous at each point in X. 

(b) If there is given an open base in Y, show that f is continuous 
at $x_0$ ⇔ for each basic open set B which contains $f(x_0)$ there 
exists a neighborhood G of $x_0$ such that $f(G) \subseteq B$. 

(c) If Y is a metric space, show that f is continuous at $x_0$ ⇔ for 
each open sphere $S_r(f(x_0))$ centered on $f(x_0)$ there exists a 
neighborhood G of $x_0$ such that $f(G) \subseteq S_r(f(x_0))$. 


19. WEAK TOPOLOGIES 


Let X be a non-empty set. If $T_1$ and $T_2$ are topologies on X such 
that $T_1 \subseteq T_2$, we say that $T_1$ is weaker than $T_2$ (or $T_2$ is stronger than $T_1$). 
In rough terms, one topology is weaker than another if it has fewer open 
sets, and stronger than another if it has more open sets. The topology 
{∅, X} is the weakest topology on X, for it is weaker than every topology; 
and the discrete topology is the strongest topology on X, since it is stronger 
than every topology. It is clear that the family of all topologies on X is a 
partially ordered set with respect to the relation “is weaker than.” 

We next show that this partially ordered set is a complete lattice. 
In Problem 16-1 we asked the reader to prove that the intersection of any 
two topologies $T_1$ and $T_2$ on X is a topology on X. Since this topology is 
evidently weaker than both $T_1$ and $T_2$ and stronger than any topology 
which is weaker than both, it is the greatest lower bound of $T_1$ and $T_2$. It 
is equally easy to see that the intersection of any non-empty family of 
topologies on X is a topology on X; and since it is weaker than all these 
and stronger than any topology which is weaker than all these, it is the 
greatest lower bound of this family. What about least upper bounds? 
The situation here is a bit different, for the union of two topologies on 
X need not be a topology. However, if we have any non-empty family 
of topologies $T_i$, then the discrete topology is a topology stronger than 
each $T_i$. We can therefore appeal to our above remarks to conclude that 
the intersection of all topologies which are stronger than each $T_i$ is a 
topology; and since it is stronger than each $T_i$ and weaker than any 
topology which is stronger than each $T_i$, it is the least upper bound of 
our given family. 

We summarize the results of this discussion in the following theorem. 


Theorem A. Let X be a non-empty set. Then the family of all topologies 
on X is a complete lattice with respect to the relation “is weaker than.” 
Furthermore, this lattice has a least member (the weakest topology on X) and 
a greatest member (the discrete topology on X). 


The reader will observe that if $\{T_i\}$ is a non-empty family of topologies 
on our set X, then the least upper bound of this family is precisely the 
topology generated by the class $\bigcup_i T_i$ in the sense of Theorem 18-D; that 
is, the class $\bigcup_i T_i$ is an open subbase for the least upper bound of the 
family $\{T_i\}$. In the present context, therefore, Theorem 18-D can be 
thought of as providing a mechanism for the direct construction of least 
upper bounds in our lattice of topologies. 
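
A three-element set already shows both halves of this discussion: the union of two topologies can fail to be a topology, while the topology generated by that union is their least upper bound. The sketch below is an added illustration; the two topologies and the helper names are assumptions chosen for the example.

```python
from itertools import combinations

def is_topology(X, T):
    """Finite check: empty set, full space, pairwise unions and intersections."""
    return (frozenset() in T and X in T
            and all((G | H) in T and (G & H) in T for G in T for H in T))

def generated_topology(X, S):
    """All unions of finite intersections of sets in S (Theorem 18-D)."""
    S = list(S)
    base = {X}
    for r in range(1, len(S) + 1):
        for combo in combinations(S, r):
            base.add(frozenset.intersection(*combo))
    base = list(base)
    topology = {frozenset()}
    for r in range(1, len(base) + 1):
        for combo in combinations(base, r):
            topology.add(frozenset().union(*combo))
    return topology

X = frozenset({"a", "b", "c"})
T1 = {frozenset(), frozenset({"a"}), X}
T2 = {frozenset(), frozenset({"b"}), X}

meet = T1 & T2                            # greatest lower bound: the intersection
join = generated_topology(X, T1 | T2)     # least upper bound: generated by the union

print(is_topology(X, T1 | T2), is_topology(X, meet), is_topology(X, join))
print(sorted(sorted(G) for G in join))
```

The union of the two topologies lacks the set {a, b}, so it is not a topology; the generated topology adds exactly that set.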

Let X be a non-empty set, let $\{X_i\}$ be a non-empty class of topological 
spaces, and for each i let $f_i$ be a mapping of X into $X_i$. It is clear 
that if X is given its discrete topology, then all the $f_i$'s are continuous. 
If we look a little further, we may find other and weaker topologies on 
X which also have this property. There is, in fact, a unique weakest 
topology of this kind. The weak topology generated by the $f_i$'s is defined to 
be the intersection of all topologies on X with respect to each of which all 
the $f_i$'s are continuous mappings. This is clearly a topology on X which 
makes all the $f_i$'s continuous, and it is weaker than any topology which 
has this property. It will appear in later chapters that many a topology 
which is used in practice is defined to be the weak topology generated by 
some set of mappings of particular interest in a given situation. 
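
For finite sets the weak topology can be computed mechanically. The following Python sketch is only an illustration under stated assumptions: it relies on the description of the weak topology as the topology generated by all inverse images of open sets (the content of Problem 1(a) below), and the names `preimage`, `topology_from_subbase`, `weak_topology`, the three-point set and the two mappings are all hypothetical choices made for the example.

```python
from itertools import combinations


def preimage(f, domain, subset):
    """Inverse image of `subset` under the mapping `f` (a dict on `domain`)."""
    return frozenset(x for x in domain if f[x] in subset)


def topology_from_subbase(domain, subbase):
    """All unions of finite intersections of the sets in `subbase`."""
    domain = frozenset(domain)
    subbase = list(subbase)
    # Finite intersections; the intersection of no sets is the whole space.
    base = {domain}
    for r in range(1, len(subbase) + 1):
        for combo in combinations(subbase, r):
            base.add(frozenset.intersection(*combo))
    # Unions of basic sets; the union of no sets is the empty set.
    opens = {frozenset()}
    for b in base:
        opens |= {o | b for o in opens}
    return opens


def weak_topology(domain, maps):
    """Weak topology generated by `maps`, a list of (f, open_sets) pairs."""
    subbase = {preimage(f, domain, g) for f, open_sets in maps for g in open_sets}
    return topology_from_subbase(domain, subbase)


X = {1, 2, 3}
# Target space {0, 1} with open sets {}, {1}, {0, 1} (an illustrative choice).
target_opens = [set(), {1}, {0, 1}]
f = {1: 1, 2: 1, 3: 0}  # inverse image of {1} is {1, 2}
g = {1: 0, 2: 1, 3: 1}  # inverse image of {1} is {2, 3}

T = weak_topology(X, [(f, target_opens), (g, target_opens)])
for open_set in sorted(T, key=lambda s: (len(s), sorted(s))):
    print(sorted(open_set))
```

In this example the set {2} is open in the weak topology although it is not an inverse image under either mapping; it arises as the intersection of {1, 2} and {2, 3}, which is why the inverse images serve only as a subbase. The enumeration is exponential in the size of the subbase and is meant for very small examples only.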


Problems 


1. Let X be a non-empty set and {X_i} a non-empty class of topological
spaces. If for each i there is given a mapping f_i of X into X_i, denote
by T the weak topology on X generated by the f_i’s.

(a) Show that T equals the topology generated by the class of all
inverse images in X of open sets in the X_i’s.

(b) If an open subbase is given in each X_i, show that T equals the
topology generated by the class of all inverse images in X of
subbasic open sets in the X_i’s.

(c) If Y is a subspace of the topological space (X,T), show that
the relative topology on Y is the weak topology generated by
the restrictions of the f_i’s to Y.

2. In each of the following we specify a set {f_i} of real functions defined
on the real line R. In each case give a complete description of the
weak topology on R generated by the f_i’s.

(a) {f_i} consists of all constant functions.

(b) {f_i} consists of a single function f, defined by f(x) = 0 if x ≤ 0
and f(x) = 1 if x > 0.

(c) {f_i} consists of a single function f, defined by f(x) = −1 if
x < 0, f(0) = 0, and f(x) = 1 if x > 0.

(d) {f_i} consists of a single function f, defined by f(x) = x for all x.

(e) {f_i} consists of all bounded functions which are continuous
with respect to the usual topology on R.

(f) {f_i} consists of all functions which are continuous with respect
to the usual topology on R.


20. THE FUNCTION ALGEBRAS C(X,R) AND C(X,C)


Let X be an arbitrary topological space. We generalize the notations
established in Sec. 14 by defining C(X,R) and C(X,C) to be the sets
of all bounded continuous functions defined on X which are, respectively,
real and complex.

It is desirable to extend our discussion of the algebraic structure of
these sets beyond that given in Sec. 14 by introducing the following
concepts. An algebra is a linear space whose vectors can be multiplied
in such a way that

(1) x(yz) = (xy)z;

(2) x(y + z) = xy + xz and (x + y)z = xz + yz;

(3) α(xy) = (αx)y = x(αy) for every scalar α.

We speak of a real algebra or a complex algebra according as the scalars
are the real numbers or the complex numbers. A commutative algebra is an
algebra whose multiplication satisfies the following condition:

(4) xy = yx.

In the case of a commutative algebra, the second part of (2) is clearly
redundant. An algebra with identity is an algebra which possesses the
following property:

(5) there exists a non-zero element in the algebra, denoted by 1 and
called the identity element (or the identity), such that
1 · x = x · 1 = x for every x.

We speak of the identity because the identity in an algebra (if it has one)
is unique; for if 1′ is also an element such that 1′ · x = x · 1′ = x for
every x, then 1′ = 1′ · 1 = 1. A subalgebra of an algebra is a linear
subspace which contains the product of each pair of its elements. A
subalgebra of an algebra is evidently an algebra in its own right.

In the case of a function space which is also an algebra, it is to be
understood that multiplication is defined pointwise, that is, that the
product fg of two functions in the space is defined by (fg)(x) = f(x)g(x).
This pointwise multiplication of functions should be clearly distinguished
from the multiplication (or composition) of mappings discussed at the
end of Sec. 3. If such an algebra has an identity element 1, then Problem
1 shows that in all cases of interest to us this identity is the constant
function defined by 1(x) = 1 for all x.

We prove two lemmas before going on to our main theorems.


Lemma. If f and g are continuous real or complex functions defined on a
topological space X, then f + g, αf, and fg are also continuous. Furthermore,
if f and g are real, then f ∧ g and f ∨ g are continuous.

PROOF. We illustrate the method by showing that fg and f ∨ g are
continuous.

We prove that fg is continuous by showing that it is continuous at an
arbitrary point x_0 in X (see Problem 18-9). Let ε > 0 be given, and
find ε_1 > 0 such that ε_1(|f(x_0)| + |g(x_0)|) + ε_1² < ε. Since f is
continuous, and thus continuous at x_0, there exists a neighborhood G_1 of
x_0 such that x ∈ G_1 ⇒ |f(x) − f(x_0)| < ε_1. Similarly, there exists a
neighborhood G_2 of x_0 such that x ∈ G_2 ⇒ |g(x) − g(x_0)| < ε_1. The
continuity of fg at x_0 now follows from the fact that G = G_1 ∩ G_2 is a
neighborhood of x_0 such that

\[
\begin{aligned}
x \in G \;\Rightarrow\; |(fg)(x) - (fg)(x_0)| &= |f(x)g(x) - f(x_0)g(x_0)| \\
&= |[f(x)g(x) - f(x)g(x_0)] + [f(x)g(x_0) - f(x_0)g(x_0)]| \\
&\le |f(x)|\,|g(x) - g(x_0)| + |g(x_0)|\,|f(x) - f(x_0)| \\
&\le \varepsilon_1 |f(x)| + \varepsilon_1 |g(x_0)| \\
&= \varepsilon_1 |[f(x) - f(x_0)] + f(x_0)| + \varepsilon_1 |g(x_0)| \\
&\le \varepsilon_1 |f(x) - f(x_0)| + \varepsilon_1 |f(x_0)| + \varepsilon_1 |g(x_0)| \\
&< \varepsilon_1 (|f(x_0)| + |g(x_0)|) + \varepsilon_1^2 < \varepsilon .
\end{aligned}
\]


We prove that f ∨ g is continuous by recalling that all sets of the form
A = (a,+∞) and B = (−∞,b) form an open subbase for the real line
and by showing that the inverse image of any such set is open (see
Theorem 18-E). All that is necessary is to observe that

\[ (f \vee g)^{-1}(A) = \{x : \max\{f(x), g(x)\} > a\} = \{x : f(x) > a\} \cup \{x : g(x) > a\}, \]

which is open since it is the union of two open sets, and that

\[ (f \vee g)^{-1}(B) = \{x : \max\{f(x), g(x)\} < b\} = \{x : f(x) < b\} \cap \{x : g(x) < b\}, \]

which is open since it is the intersection of two open sets.
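
The companion statement for f ∧ g is not written out in the proof. A sketch along the same lines, using the same subbasic sets A = (a,+∞) and B = (−∞,b), might read as follows; the roles of union and intersection are simply interchanged.

```latex
\[
\begin{aligned}
(f \wedge g)^{-1}(A) &= \{x : \min\{f(x), g(x)\} > a\} = \{x : f(x) > a\} \cap \{x : g(x) > a\},\\
(f \wedge g)^{-1}(B) &= \{x : \min\{f(x), g(x)\} < b\} = \{x : f(x) < b\} \cup \{x : g(x) < b\}.
\end{aligned}
\]
```

Both sets are open, the first as the intersection of two open sets and the second as the union of two open sets.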


Lemma. Let X be a topological space, and let {f_n} be a sequence of real
or complex functions defined on X which converges uniformly to a function f
defined on X. If all the f_n’s are continuous, then f is also continuous.

PROOF. We show that f is continuous by showing that it is continuous
at an arbitrary point x_0 in X. Let ε > 0 be given. Since f is the uniform
limit of the f_n’s, there exists a positive integer n_0 such that
|f(x) − f_{n_0}(x)| < ε/3 for all points x in X. Since f_{n_0} is continuous,
and thus continuous at x_0, there exists a neighborhood G of x_0 such that
x ∈ G ⇒ |f_{n_0}(x) − f_{n_0}(x_0)| < ε/3. The continuity of f at x_0 now
follows from the fact that

\[
\begin{aligned}
x \in G \;\Rightarrow\; |f(x) - f(x_0)|
&= |[f(x) - f_{n_0}(x)] + [f_{n_0}(x) - f_{n_0}(x_0)] + [f_{n_0}(x_0) - f(x_0)]| \\
&\le |f(x) - f_{n_0}(x)| + |f_{n_0}(x) - f_{n_0}(x_0)| + |f_{n_0}(x_0) - f(x_0)| \\
&< \varepsilon/3 + \varepsilon/3 + \varepsilon/3 = \varepsilon .
\end{aligned}
\]


This lemma is often stated more informally as follows: any uniform
limit of continuous functions is continuous.
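
The word “uniform” cannot be dropped here. A standard illustration, which is not part of the text and is offered only as a sketch, takes X to be the closed unit interval [0,1] and f_n(x) = x^n.

```latex
\[
f_n(x) = x^n \ (0 \le x \le 1), \qquad
f(x) = \lim_{n \to \infty} f_n(x) =
\begin{cases} 0, & 0 \le x < 1,\\ 1, & x = 1, \end{cases}
\qquad
\sup_{0 \le x \le 1} |f_n(x) - f(x)| = \sup_{0 \le x < 1} x^n = 1 .
\]
```

Each f_n is continuous and f_n converges to f pointwise, but the supremum above equals 1 for every n, so the convergence is not uniform; and the limit f is in fact discontinuous at x = 1.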

We are now in a position to give Theorem 14-A the following broader 
and richer form. 


Theorem A. Let C(X,R) be the set of all bounded continuous real functions
defined on a topological space X. Then (1) C(X,R) is a real Banach space
with respect to pointwise addition and scalar multiplication and the norm
defined by ||f|| = sup |f(x)|; (2) if multiplication is defined pointwise,
C(X,R) is a commutative real algebra with identity in which ||fg|| ≤ ||f|| ||g||
and ||1|| = 1; and (3) if f ≤ g is defined to mean that f(x) ≤ g(x) for all x,
C(X,R) is a lattice in which the greatest lower bound and least upper bound
of a pair of functions f and g are given by (f ∧ g)(x) = min {f(x),g(x)} and
(f ∨ g)(x) = max {f(x),g(x)}.

PROOF. In view of the above lemmas, everything stated here is clear,
except perhaps the fact that ||fg|| ≤ ||f|| ||g||; and this follows from

\[
\|fg\| = \sup |(fg)(x)| = \sup |f(x)g(x)| = \sup |f(x)|\,|g(x)|
\le (\sup |f(x)|)(\sup |g(x)|) = \|f\|\,\|g\| .
\]
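
The inequality for the norm of a product can be strict. As a quick check (the space X = [0,1] and the two functions are chosen here only for illustration), consider the following.

```latex
\[
f(x) = x, \quad g(x) = 1 - x \quad (0 \le x \le 1): \qquad
\|f\| = \|g\| = 1, \qquad
\|fg\| = \sup_{0 \le x \le 1} x(1 - x) = \tfrac{1}{4} < 1 = \|f\|\,\|g\| .
\]
```

The supremum of |f| and the supremum of |g| are attained at different points, which is exactly what the step from sup |f(x)| |g(x)| to (sup |f(x)|)(sup |g(x)|) allows for.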


We also extend Theorem 14-B, but in a slightly different direction. 


Theorem B. Let C(X,C) be the set of all bounded continuous complex functions
defined on a topological space X. Then (1) C(X,C) is a complex
Banach space with respect to pointwise addition and scalar multiplication
and the norm defined by ||f|| = sup |f(x)|; (2) if multiplication is defined
pointwise, C(X,C) is a commutative complex algebra with identity in which
||fg|| ≤ ||f|| ||g|| and ||1|| = 1; and (3) if f̄ is defined by letting f̄(x) be
the complex conjugate of f(x), then f → f̄ is a mapping of the algebra C(X,C)
into itself which has the following properties:

\[
\overline{f + g} = \bar{f} + \bar{g}, \quad
\overline{\alpha f} = \bar{\alpha}\,\bar{f}, \quad
\overline{fg} = \bar{f}\,\bar{g}, \quad
\overline{\bar{f}} = f, \quad
\|\bar{f}\| = \|f\| .
\]

PROOF. This theorem is a direct consequence of the background provided
above. We do remark, however, that the fact that f̄ is continuous
when f is follows from |f̄(x) − f̄(x_0)| = |f(x) − f(x_0)|.


The function f̄ defined in this theorem is called the conjugate of the
function f, and the operation of forming f̄ from f is called conjugation.
It will become clear in the later chapters of this book that the operation of
conjugation in the space C(X,C) is one of the chief supporting pillars
of the theory we develop in those chapters.

We trust that the reader has noticed our insistence on assuming that
a topological space always has at least one point. Our reason for this is 
that the empty set has no functions defined on it. If we were to allow a 
topological space X to be empty, then we would have to cope with the 
fact that its corresponding C(X,R) and C(X,C) are also empty, and so 
cannot be linear spaces, for a linear space must contain at least one vector 
(the zero vector). Since constant functions are always continuous, we 
avoid this difficulty by taking pains to assume that topological spaces 
are non-empty. 


Problems 


1. Let A be an algebra of real or complex functions defined on a non-empty
set X, and assume that for each point x in X there is a function
f in A such that f(x) ≠ 0. Show that if A contains an identity
element 1, then 1(x) = 1 for all x.

2. Let f be a continuous real or complex function defined on a topological
space X, and assume that f is not identically zero, i.e., that the set
Y = {x : f(x) ≠ 0} is non-empty. Prove in detail that the function
1/f defined by (1/f)(x) = 1/f(x) is continuous at each point of the
subspace Y.

3. Let X be a topological space and A a subalgebra of C(X,R) or C(X,C).
Show that its closure Ā is also a subalgebra. If A is a subalgebra
of C(X,C) which contains the conjugate of each of its functions,
show that Ā also contains the conjugate of each of its functions.


CHAPTER FOUR 


Compactness 


Like many other notions in topology, the concept of compactness 
for a topological space is an abstraction of an important property pos- 
sessed by certain sets of real numbers. The property we have in mind 
is expressed by the Heine-Borel theorem, which asserts the following: 
if X is a closed and bounded subset of the real line R, then any class of 
open subsets of R whose union contains X has a finite subclass whose 
union also contains X. If we regard X as a topological space in its own 
right, as a subspace of R, then this theorem can be thought of as saying 
that any class of open subsets of X whose union is X has a finite subclass 
whose union is also X. 

The Heine-Borel theorem has a number of profound and far-reaching 
applications in analysis. Many of these guarantee that continuous 
functions defined on closed and bounded sets of real numbers are well 
behaved. For instance, any such function is automatically bounded 
and uniformly continuous. In contrast to this satisfying behavior, 
we note that the function f defined on the open unit interval (0,1) by 
f(x) = 1/x is neither bounded nor uniformly continuous.

As is often the case with crucial theorems in analysis, the conclusion 
of the Heine-Borel theorem is converted into a definition in topology. 
This definition singles out for special attention what are called compact 
topological spaces. Our main business in this chapter is to develop the 
basic properties of these spaces and the continuous functions they carry, 
and, in the case of metric spaces, to establish several equivalent forms of 
compactness which are useful in applications.


21. COMPACT SPACES


Let X be a topological space. A class {G_i} of open subsets of X is
said to be an open cover of X if each point in X belongs to at least one G_i,
that is, if ∪_i G_i = X. A subclass of an open cover which is itself an
open cover is called a subcover. A compact space is a topological space
in which every open cover has a finite subcover. A compact subspace of a
topological space is a subspace which is compact as a topological space
in its own right. We begin by proving two simple but widely used
theorems.
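
Before turning to these theorems, it may help to see the definition fail in a familiar case. The open unit interval (0,1), mentioned in the introduction to this chapter, is not compact; the particular cover below is a standard choice and is supplied here only as an illustration.

```latex
\[
G_n = \left(\tfrac{1}{n},\, 1\right) \ (n = 2, 3, \ldots), \qquad
\bigcup_{n \ge 2} G_n = (0,1), \qquad
G_{n_1} \cup \cdots \cup G_{n_k} = \left(\tfrac{1}{N},\, 1\right)
\ \text{ with } N = \max\{n_1, \ldots, n_k\} .
\]
```

No finite subclass covers (0,1), since the union of any finite subclass omits every point x with 0 < x ≤ 1/N.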


Theorem A. Any closed subspace of a compact space is compact. 


PROOF. Let Y be a closed subspace of a compact space X, and let {G_i}
be an open cover of Y. Each G_i, being open in the relative topology
on Y, is the intersection with Y of an open subset H_i of X. Since Y is
closed, the class composed of Y′ and all the H_i’s is an open cover of X,
and since X is compact, this open cover has a finite subcover. If Y′
occurs in this subcover, we discard it. What remains is a finite class of
H_i’s whose union contains Y. Our conclusion that Y is compact now
follows from the fact that the corresponding G_i’s form a finite subcover
of the original open cover of Y.


Theorem B. Any continuous image of a compact space is compact. 

PROOF. Let f: X → Y be a continuous mapping of a compact space X
into an arbitrary topological space Y. We must show that f(X) is a
compact subspace of Y. Let {G_i} be an open cover of f(X). As in the
above proof, each G_i is the intersection with f(X) of an open subset H_i
of Y. It is clear that {f⁻¹(H_i)} is an open cover of X, and by the
compactness of X it has a finite subcover. The union of the finite class of H_i’s
of which these are the inverse images clearly contains f(X), so the class
of corresponding G_i’s is a finite subcover of the original open cover of
f(X), and f(X) is compact.


It is sometimes quite difficult to prove that a given topological space 
is compact by appealing directly to the definition. The following theo- 
rems give several equivalent forms of compactness which are often easier 
to apply. 


Theorem C. A topological space is compact ⇔ every class of closed sets
with empty intersection has a finite subclass with empty intersection.

PROOF. This is a direct consequence of the fact that a class of open
sets is an open cover ⇔ the class of all their complements has empty
intersection.

We recall from Problem 8-6 that a class of subsets of a non-empty
set is said to have the finite intersection property if every finite subclass
has non-empty intersection. This concept enables us to express Theorem C
as follows.


Theorem D. A topological space is compact ⇔ every class of closed sets
with the finite intersection property has non-empty intersection.
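
Theorem D gives a quick way to see that a space is not compact. As an illustration (the example is a standard one and is not taken from the text), consider the following closed subsets of the real line R.

```latex
\[
F_n = [n, +\infty) \ (n = 1, 2, 3, \ldots), \qquad
F_{n_1} \cap \cdots \cap F_{n_k} = [\max\{n_1, \ldots, n_k\}, +\infty) \ne \emptyset, \qquad
\bigcap_{n \ge 1} F_n = \emptyset .
\]
```

The class {F_n} has the finite intersection property but empty intersection, so by Theorem D the real line is not compact.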


Let X be a topological space. An open cover of X whose sets are all 
in some given open base is called a basic open cover, and if they all lie 
in some given open subbase, it is called a subbasic open cover. We observe 
the trivial fact that if X is compact, then every basic open cover has a 
finite subcover. Our next theorem asserts that compactness not only 
implies this property, but is also implied by it. 


Theorem E. A topological space is compact if every basic open cover has a
finite subcover.

PROOF. Let {G_i} be an open cover and {B_j} an open base. Each G_i is
the union of certain B_j’s, and the totality of all such B_j’s is clearly a basic
open cover. By our hypothesis, this class of B_j’s has a finite subcover.
For each set in this finite subcover we can select a G_i which contains it.
The class of G_i’s which arises in this way is evidently a finite subcover
of the original open cover.


We go one more step in this direction and prove a similar (and much 
deeper) theorem relating to subbasic open covers. The proof is rather 
difficult, and we introduce the following concepts in an effort to make 
it as simple as possible. They also make some of its applications con- 
siderably easier to handle. Let X be a topological space. A class of 
closed subsets of X is called a closed base if the class of all complements 
of its sets is an open base, and a closed subbase if the class of all comple- 
ments is an open subbase. Since the class of all finite intersections of 
sets in an open subbase is an open base, it follows that the class of all 
finite unions of sets in a closed subbase is a closed base. This is called 
the closed base generated by the closed subbase. 


Theorem F. A topological space is compact if every subbasic open cover
has a finite subcover, or equivalently, if every class of subbasic closed sets
with the finite intersection property has non-empty intersection.

PROOF. The equivalence of the stated conditions is an easy consequence
of Theorems C and D. Consider a closed subbase for our space, and
let {B_i} be its generated closed base, that is, the class of all finite unions
of its sets. We assume that every class of subbasic closed sets with the
finite intersection property has non-empty intersection, and we prove
from this that every class of B_i’s with the finite intersection property
also has non-empty intersection. By Theorem E, this will suffice to
prove our theorem.

Let {B_j} be a class of B_i’s with the finite intersection property.
We must show that ∩_j B_j is non-empty. We use Zorn’s lemma to show
that {B_j} is contained in some class {B_k} of B_i’s which is maximal with
respect to having the finite intersection property, in the sense that {B_k}
has this property and any class of B_i’s which properly contains {B_k}
fails to have this property. The argument runs as follows. Consider
the family of all classes of B_i’s which contain {B_j} and have the finite
intersection property. This is a partially ordered set with respect to
class inclusion. If we consider a chain in this partially ordered set, the
union of all classes in it is a class of B_i’s which contains every member
of the chain and has the finite intersection property, as we see from the
fact that every finite class of its sets is contained in some member of the
chain, and that member has the finite intersection property. We conclude
that every chain in our partially ordered set has an upper bound,
so Zorn’s lemma guarantees that the partially ordered set has a maximal
element. This argument yields the existence of a class {B_k} with the
properties stated above. Since ∩_k B_k ⊆ ∩_j B_j, it now suffices to show
that ∩_k B_k is non-empty.

Each B_k is a finite union of sets in our closed subbase, for instance,
B_k = S_1 ∪ S_2 ∪ ··· ∪ S_n. It now suffices to show that at least one
of the sets S_1, S_2, . . . , S_n belongs to the class {B_k}. For if we obtain
such a set for each B_k, the resulting class of subbasic closed sets will have
the finite intersection property (since it is contained in {B_k}), and therefore,
by our hypothesis relating to the subbasic closed sets, it will have
non-empty intersection; and since this non-empty intersection will be a
subset of ∩_k B_k, we shall know that ∩_k B_k is itself non-empty.

We finish the proof by showing that at least one of the sets S_1,
S_2, . . . , S_n does in fact belong to the class {B_k}. We assume that
each of these sets is not in this class, and we deduce a contradiction from
this assumption. Since S_1 is a subbasic closed set, it is also a basic
closed set; and since it is not in the class {B_k}, the class {B_k,S_1} is a class
of B_i’s which properly contains {B_k}. By the maximality property of
{B_k}, the class {B_k,S_1} lacks the finite intersection property, so S_1 is
disjoint from the intersection of some finite class of B_k’s. If we do this
for each of the sets S_1, S_2, . . . , S_n, we see that B_k (the union of these
sets) is disjoint from the intersection of the total finite class of all the
B_k’s which arise in this way. This contradicts the finite intersection
property for the class {B_k} and completes the proof.


The great power of this theorem can be surmised from the complexity
of its proof. It is really a tool, and we illustrate the manner in which
it can be used by applying it to give a simple proof of the classical Heine-
Borel theorem stated in the introduction to this chapter.


Theorem G (the Heine-Borel Theorem). Every closed and bounded subspace
of the real line is compact.

PROOF. A closed and bounded subspace of the real line is a closed subspace
of some closed interval [a,b], and by Theorem A it suffices to show
that [a,b] is compact. If a = b, this is clear, so we may assume that
a < b. By Sec. 18, we know that the class of all intervals of the form
[a,d) and (c,b], where c and d are any real numbers such that a < c < b
and a < d < b, is an open subbase for [a,b]; therefore the class of all
[a,c]’s and all [d,b]’s is a closed subbase. Let S = {[a,c_i], [d_j,b]} be a
class of these subbasic closed sets with the finite intersection property.
It suffices by Theorem F to show that the intersection of all sets in S is
non-empty. We may assume that S is non-empty. If S contains only
intervals of the type [a,c_i], or only intervals of the type [d_j,b], then the
intersection clearly contains a or b. We may thus assume that S contains
intervals of both types. We now define d by d = sup {d_j}, and
we complete the proof by showing that d ≤ c_i for every i. Suppose that
c_{i_0} < d for some i_0. Then by the definition of d there exists a d_{j_0} such
that c_{i_0} < d_{j_0}. Since [a,c_{i_0}] ∩ [d_{j_0},b] = ∅, this contradicts the finite
intersection property for S and concludes the proof.


The reader should understand that there are elementary proofs of the 
Heine-Borel theorem which do not use Theorem F or anything like it. 
Theorem F will render us its major service in connection with the proof 
of the vital Tychonoff theorem of Sec. 23. 


Problems 


1. A countably compact space is a topological space in which every countable
open cover has a finite subcover. Prove that a second countable
space is countably compact ⇔ it is compact.

2. Let Y be a subspace of a topological space X. If Z is a non-empty
subset of Y, show that Z is compact as a subspace of Y ⇔ it is compact
as a subspace of X.

3. Let X be a topological space. If {X_i} is a non-empty finite class
of compact subspaces of X, show that ∪_i X_i is also a compact subspace
of X. If {X_i} is a non-empty class of compact subspaces of X
each of which is closed, and if ∩_i X_i is non-empty, show that ∩_i X_i
is also a compact subspace of X.

4. Let X be a compact space. We know by Theorem A that every
closed subspace of X is compact. By considering Example 16-3,
show that a compact subspace of X need not be closed.

5. Prove the converse of the Heine-Borel theorem: every compact subspace
of the real line is closed and bounded.

6. Generalize the preceding problem by proving that a compact sub- 
space of an arbitrary metric space is closed and bounded. (It should 
be carefully noted, as Secs. 24 and 25 will show, that a closed and 
bounded subspace of an arbitrary metric space is not necessarily 
compact.) 

7. Show that a continuous real or complex function defined on a com- 
pact space is bounded. More generally, show that a continuous 
mapping of a compact space into any metric space is bounded. 

8. Show that a continuous real function f defined on a compact space X
attains its infimum and its supremum in the following sense: if
a = inf {f(x) : x ∈ X} and b = sup {f(x) : x ∈ X}, then there exist
points x_1 and x_2 in X such that f(x_1) = a and f(x_2) = b.

9. If X is a compact space, and if {f_n} is a monotone sequence of continuous
real functions defined on X which converges pointwise to a
continuous real function f defined on X, show that f_n converges
uniformly to f. (The assumption that {f_n} is a monotone sequence
means that either f_1 ≤ f_2 ≤ f_3 ≤ ··· or f_1 ≥ f_2 ≥ f_3 ≥ ···.)


22. PRODUCTS OF SPACES 


There are two main techniques for making new topological spaces 
out of old ones. The first of these, and the simplest, is to form subspaces 
of some given space. The second is to multiply together a number of 
given spaces. Our purpose in this section is to describe the way in which 
the latter process is carried out. 

In Sec. 4 we defined what is meant by the product P_i X_i of an arbitrary
non-empty class of sets. We also defined the projection p_i of this
product onto its ith coordinate set X_i. The reader should make certain
that these concepts are firmly in mind. If each coordinate set is a
topological space, then there is a standard method of defining a topology
on the product. It is difficult to exaggerate the importance of this
definition, and we examine it with great care in the following discussion.

Let us begin by recalling the discussion in Sec. 18 of open rectangles
and open strips in the Euclidean plane R². We observed there that the
open rectangles form an open base for the topology of R², and also that
the open strips form an open subbase for this topology whose generated
open base consists of all open rectangles, all open strips, the empty set,
and the full space. The topology of the Euclidean plane is of course
defined in terms of a metric. If we wish, however, we can ignore this
fact and regard the topology of R² as generated in the sense of Theorem
18-D by the class of all open strips. This situation provides the motivation
for the more general ideas we now develop.

Let X_1 and X_2 be topological spaces, and form the product
X = X_1 × X_2 of the two sets X_1 and X_2. Consider the class S of all
subsets of X of the form G_1 × X_2 and X_1 × G_2, where G_1 and G_2 are
open subsets of X_1 and X_2, respectively. The topology on X generated
by this class in the sense of Theorem 18-D is called the product topology
on X. The product topology therefore has S as an open subbase; in
fact, it is defined by the requirement that S be an open subbase. The
open base generated by S, that is, the class of all finite intersections of
its sets, is clearly the class of all sets of the form G_1 × G_2, and the open
sets in X are all unions of these sets. There are two projections
p_1 and p_2 of X onto its coordinate spaces X_1 and X_2, and by definition
they carry a typical element (x_1,x_2) of X to x_1 and x_2, respectively.
We note that S is precisely the class of all inverse images in X of all
open subsets of X_1 and X_2 under these projections:
G_1 × X_2 = p_1⁻¹(G_1) and X_1 × G_2 = p_2⁻¹(G_2). The product topology is
thus a topology on the product with respect to which both projections
are continuous mappings, and it is evidently the weakest such topology.
In terms of the ideas discussed in Sec. 19, the product topology can be
regarded as the weak topology generated by the projections. Figure 23
may assist the reader in visualizing some of these notions.

Fig. 23. The product topology on X_1 × X_2.
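
For two finite spaces the product topology can be enumerated directly. The Python sketch below is only an illustration: the function names `unions_of` and `product_topology` and the two small coordinate spaces are hypothetical choices made for the example. It builds the open base of all sets G_1 × G_2, takes all unions, and then checks that both projections send open sets to open sets.

```python
from itertools import product


def unions_of(base):
    """All unions of sets in `base`; the union of no sets is the empty set."""
    opens = {frozenset()}
    for b in base:
        opens |= {o | b for o in opens}
    return opens


def product_topology(opens1, opens2):
    """Product topology on X1 x X2, built from the base of all G1 x G2."""
    base = {frozenset(product(g1, g2)) for g1 in opens1 for g2 in opens2}
    return unions_of(base)


# Two small spaces chosen only for illustration.
# X1 = {0, 1} with open sets {}, {1}, {0, 1}.
opens1 = [frozenset(), frozenset({1}), frozenset({0, 1})]
# X2 = {a, b} with the discrete topology.
opens2 = [frozenset(), frozenset("a"), frozenset("b"), frozenset("ab")]

T = product_topology(opens1, opens2)

# Both projections send open sets to open sets.
p1_open = all(frozenset(x1 for x1, _ in U) in opens1 for U in T)
p2_open = all(frozenset(x2 for _, x2 in U) in opens2 for U in T)

# An open set that is a union of basic sets but not itself of the form G1 x G2.
U = frozenset({(1, "a"), (0, "b"), (1, "b")})
is_rectangle = U == frozenset(product({x1 for x1, _ in U}, {x2 for _, x2 in U}))

print(len(T), p1_open, p2_open, U in T, is_rectangle)
```

The set U in this example is open, being the union of {1} × {a} and X_1 × {b}, yet it is not a product of two open sets. This is the finite counterpart of the remark that the open sets in X are all unions of sets of the form G_1 × G_2, not merely those sets themselves.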

With the above concepts to guide the way, one more step carries us
to the product topology in its full generality. Let {X_i} be any non-empty
class of topological spaces, and consider the product X = P_i X_i
of the sets X_i. A typical element x in X is an array x = {x_i} of points
in the coordinate spaces, where each x_i belongs to the corresponding
space X_i; and for each index i, the projection p_i is defined by p_i(x) = x_i.
We now define the product topology on X to be the weak topology generated
by the set of all projections. This means the product topology
is that generated by the class S of all inverse images in X of open sets
in the X_i’s, that is, the class S of all subsets of X of the form S = p_i⁻¹(G_i),
where i is any index and G_i is any open subset of X_i. It is easy to see
that S can also be described as the class of all products of the form
S = P_i G_i, where G_i is an open subset of X_i which equals X_i for all i’s but
one. The class S is called the defining open subbase for the product
topology; and the class of all complements of sets in S, namely, the
class of all products of the form P_i F_i, where F_i is a closed subset of X_i
which equals X_i for all i’s but one, is called the defining closed subbase.
The open base generated by S, that is, the class of all finite intersections
of its sets, is called the defining open base for the product topology; and
this is evidently the class of all products of the form P_i G_i, where G_i is an
open subset of X_i which equals X_i for all but a finite number of i’s. It
should be clearly understood that an unrestricted product of open sets in
the coordinate spaces need not be open in the product topology. A convenient
way of thinking about this defining open base is that a typical one of its
sets consists of all points x = {x_i} in the product such that the ith
coordinate x_i is required to lie in an open subset G_i of X_i for a finite
number of i’s, all other coordinates being unrestricted.

When the product of a non-empty class of topological spaces is equipped
with the product topology defined in the above paragraph, it is called a
product space, or more simply, the product of the spaces involved.¹ It
should be clear from Theorem 18-E and these definitions that all projections
of a product space onto its coordinate spaces are automatically both
continuous and open.

Fig. 24. A set in the defining open base for a product space.
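
The warning that an unrestricted product of open sets need not be open can be made concrete. In the sketch below, the countable index set and the choice of the real line R for every coordinate space are assumptions made only for the sake of the example.

```latex
\[
X = \mathop{P}_{n \ge 1} X_n \ \text{ with } X_n = R, \qquad
G = \mathop{P}_{n \ge 1} (-1, 1), \qquad
0 = \{0, 0, 0, \ldots\} \in G .
\]
```

Any set in the defining open base which contains the point 0 has the form P_n G_n with G_n = R for all but a finite number of n’s. Such a set contains a point whose coordinate at some unrestricted index equals 2, and this point is not in G. No basic open set containing 0 is therefore contained in G, so G is not open in the product topology, although each of its factors (−1,1) is open in R.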

We conclude this section by analyzing an example which we hope will
increase the reader’s capacity to “see” the structure of product spaces.
Let the index set I consist of all real numbers i in the closed unit interval
[0,1]. I is to be considered as a set without any structure. Now let
each index i have attached to it a topological space X_i, and let every X_i
be a replica of the closed unit interval [0,1] with its usual topology. The
resulting product space X = P_i X_i is illustrated in Fig. 24. The base
of this figure is the index set I, and each vertical cross section represents
the coordinate space X_i attached to the index at its base. An element
of the product space X is an array of points, one of which lies in each X_i.
Such an element is essentially a function (if we identify a function with
its graph) defined on the set I, with values in the closed unit interval.
We can now visualize as follows a typical set in the defining open base for
the product topology. We choose a finite set of indices, say {i_1, i_2, i_3},
and for each of these we choose an open set on the vertical segment above
it. Our basic open set then consists of all functions in X whose graphs
cross each of these three vertical segments within the given open set on
that segment. In the figure, f belongs to our basic open set, but g does
not. The product topology on any product space can be visualized in a
similar way. All one has to do is imagine the coordinate spaces as fibers,
each attached to a specific element of the index set. The resulting
mental image of the product space will then look something like a bundle
of fibers, or perhaps a bed of reeds growing in a pond.

¹ Since a topological space must first of all be a non-empty set, it is worth
remarking here that the axiom of choice guarantees that the product of a
non-empty class of non-empty sets is non-empty.


Problems 


1. All projections, being open mappings, send open sets to open sets. 
Use the Euclidean plane to show that a projection need not send 
closed sets to closed sets. 

2. Show that the relative topology on a subspace of a product space is 
the weak topology generated by the restrictions of the projections 
to that subspace. 

3. Let f be a mapping of a topological space X into a product space
P_i X_i, and show that f is continuous ⇔ p_i f is continuous for each
projection p_i.

4. Consider the product space defined and discussed in the last para- 
graph of the text, and show that this space is not second countable. 
(Hint: recall Theorem 18-B, and observe that the index set is uncount- 
ably infinite.) 

5. Let X, Y, and Z be topological spaces, and consider a mapping
z = f(x,y) of the product set X × Y into the set Z. We say that
f is continuous in x if for each fixed y_0 the mapping of X into Z
given by z = f(x,y_0) is continuous. The statement that f is
continuous in y is defined similarly. f is said to be jointly continuous
in x and y if it is continuous as a mapping of the product space X × Y
into the space Z.

(a) If all three spaces are metric spaces, show that f is jointly
continuous ⇔ x_n → x and y_n → y implies f(x_n,y_n) → f(x,y).

(b) Show that if f is jointly continuous, then it is continuous in each
variable separately. Show that the converse of this statement
is false by considering the real function defined on the Euclidean
plane by f(x,y) = xy/(x^2 + y^2) and f(0,0) = 0.


23. TYCHONOFF’S THEOREM AND LOCALLY COMPACT SPACES 


The main theorem of this section, to the effect that any product of 
compact spaces is compact, is perhaps the most important single theorem
of general topology. We shall use it repeatedly throughout the rest of
this book, and the reader will come to see that its commanding position 
is due largely to the fact that in the higher levels of our subject many 
spaces constructed for special purposes turn out to be closed subspaces of 
products of compact spaces. Such a subspace is necessarily compact, and 
since compact spaces are so pleasant to work with, this makes the 
resulting theory much cleaner and smoother than would otherwise be the 
case. 


Theorem A (Tychonoff’s Theorem). The product of any non-empty class
of compact spaces is compact.

PROOF. Let {X_i} be a non-empty class of compact spaces, and form the
product X = P_i X_i. Let {F_j} be a non-empty subclass of the defining
closed subbase for the product topology on X. This means that each
F_j is a product of the form F_j = P_i F_{ij}, where F_{ij} is a closed subset of X_i
which equals X_i for all i’s but one. We assume that the class {F_j} has
the finite intersection property, and by virtue of Theorem 21-F we
conclude the proof by showing that ∩_j F_j is non-empty. For a given fixed
i, {F_{ij}} is a class of closed subsets of X_i with the finite intersection
property; and by the assumed compactness of X_i (and Theorem 21-D),
there exists a point x_i in X_i which belongs to ∩_j F_{ij}. If we do this for
each i, we obtain a point x = {x_i} in X which is in ∩_j F_j.


As our first application of Tychonoff’s theorem, we prove an exten- 
sion of the classical Heine-Borel theorem. We prepare the way for this 
proof by defining what we mean by open and closed rectangles in the 
n-dimensional Euclidean space R^n. If (a_i, b_i) is a bounded open interval
on the real line for each i = 1, 2, ..., n, then the subset of R^n defined by


P_{i=1}^n (a_i, b_i) = {(x_1, x_2, ..., x_n) : a_i < x_i < b_i for each i}


is called an open rectangle in R^n. A closed rectangle is defined similarly, as
a product of n closed intervals.


Theorem B (the Generalized Heine-Borel Theorem). Every closed and
bounded subspace of R^n is compact.

PROOF. A closed and bounded subspace of R^n is a closed subspace of
some closed rectangle, so by Theorem 21-A it suffices to show that any
closed rectangle is compact as a subspace of R^n. Let X = P_{i=1}^n [a_i, b_i] be
a closed rectangle in R^n. Each coordinate space [a_i, b_i] is compact by the
classical Heine-Borel theorem, so by Tychonoff’s theorem it suffices to
show that the product topology on X is the same as its relative topology
as a subspace of R^n. It is easy to see that the open rectangles in R^n form
an open base for its usual topology, that is, for its metric topology, and
from this it follows that the product topology on R^n is the same as its
usual topology. By Problem 22-2, the relative topology on X is the
weak topology generated by its n projections onto the coordinate spaces
[a_i, b_i]; but this is the product topology on X, so the proof is complete.¹


The n-dimensional Euclidean space R^n is the most important example
of a type of topological space which is of great significance in modern
analysis, especially in the theory of integration. A topological space is
said to be locally compact if each of its points has a neighborhood with
compact closure. It is easy to see by the above theorem that R^n actually
is locally compact, because any open sphere centered on any point is a
neighborhood of the point whose closure, being a closed and bounded
subspace of R^n, is compact. It is trivial that any compact space is
locally compact, for the full space is a neighborhood with compact 
closure of every point in the space. We return to the study of locally 
compact spaces in Sec. 37, where we give a more detailed analysis of 
their structure and properties. 


Problems 


1. Prove in detail that the open rectangles in R^n form an open base.

2. Show that every closed and bounded subspace of the n-dimensional
unitary space C^n is compact.

3. Show that a topological space is locally compact ⇔ there is an open
base at each point whose sets all have compact closures.

4. Observe that any discrete space is locally compact. Assuming that 
there are topological spaces which are not locally compact (we 
assure the reader that this is true), show that a continuous image of a 
locally compact space need not be locally compact. 


24. COMPACTNESS FOR METRIC SPACES 


In all candor, we must admit that the intuitive meaning of compact- 
ness for topological spaces is somewhat elusive. This concept, however, 
is so vitally important throughout topology that we consider it worth- 
while to devote this and the next section to giving several equivalent 
forms of compactness for the special case of a metric space. Some of 
these are quite useful in applications and are perhaps more directly 
comprehensible than the open cover definition. We hope they will help
the reader to achieve a fuller understanding of the geometric significance
of compactness.¹


1 It is worth remarking that the high-powered machinery used in the proof of
Theorem 23-B is not really necessary for proving the theorem. There are other
proofs which are more elementary in nature, but we prefer the one given there
because it illustrates some of our current concepts and tools.

We begin by recalling the classical Bolzano-Weierstrass theorem: if
X is a closed and bounded subset of the real line, then every infinite
subset of X has a limit point in X. This suggests that we consider the
property expressed here as one which a general metric space may or may
not possess. A metric space is said to have the Bolzano-Weierstrass
property if every infinite subset has a limit point. Another property
closely allied to this is that of sequential compactness: a metric space is 
said to be sequentially compact if every sequence in it has a convergent 
subsequence. Our main purpose in this section is to prove that each of 
these properties is equivalent to compactness in the case of a metric 
space. The following is an outline of our procedure: we first prove that 
these two properties are equivalent to one another; next, that compact- 
ness implies the Bolzano-Weierstrass property; and finally, that sequential 
compactness implies compactness. The first two of these steps are 
relatively simple, but the last involves several stages. 


Theorem A. A metric space is sequentially compact ⇔ it has the
Bolzano-Weierstrass property.

PROOF. Let X be a metric space, and assume first that X is sequentially
compact. We show that an infinite subset A of X has a limit point.
Since A is infinite, a sequence {x_n} of distinct points can be extracted
from A. By our assumption of sequential compactness, this sequence
has a subsequence which converges to a point x. Theorem 12-A shows
that x is a limit point of the set of points of the subsequence, and since
this set is a subset of A, x is also a limit point of A.

We now assume that every infinite subset of X has a limit point,
and we prove from this that X is sequentially compact. Let {x_n} be an
arbitrary sequence in X. If {x_n} has a point which is infinitely repeated,
then it has a constant subsequence, and this subsequence is clearly
convergent. If no point of {x_n} is infinitely repeated, then the set A of
points of this sequence is infinite. By our assumption, the set A has a
limit point x, and it is easy to extract from {x_n} a subsequence which
converges to x.


Theorem B. Every compact metric space has the Bolzano-Weierstrass
property.

PROOF. Let X be a compact metric space and A an infinite subset of X.
We assume that A has no limit point, and from this we deduce a
contradiction. By our assumption, each point of X is not a limit point of A,
so each point of X is the center of an open sphere which contains no point
of A different from its center. The class of all these open spheres is an
open cover, and by compactness there exists a finite subcover. Since
A is contained in the set of all centers of spheres in this subcover, A is
clearly finite. This contradicts the fact that A is infinite, and concludes
the proof.


1 A solid case can be made for the proposition that compact spaces are natural
generalizations of spaces with only a finite number of points. For a discussion of
this, and of the significance of compactness in analysis, see Hewitt [19].


Our next task is to prove that compactness is implied by sequential 
compactness. We carry this out in several stages, the first of which can 
be motivated by the following considerations. Let {G_i} be an open cover
of a metric space X. Then each point x in X belongs to at least one
G_i, and since the G_i’s are open, each point x is the center of some open
sphere which is contained in at least one G_i. If we now move to another
point of X, we may be forced to decrease the radius of our open sphere
in order to squeeze it into a G_i. Under special circumstances it may not
be necessary to take radii below a certain level as we move from point to
point over the entire space. The following concept is useful for handling
this sort of situation. A real number a > 0 is called a Lebesgue number
for our given open cover {G_i} if each subset of X whose diameter is less
than a is contained in at least one G_i.


Theorem C (Lebesgue’s Covering Lemma). In a sequentially compact
metric space, every open cover has a Lebesgue number.

PROOF. Let X be a sequentially compact metric space, and let {G_i} be
an open cover. We say that a subset of X is “big” if it is not contained
in any G_i. If there are no big sets, then any positive real number will
serve as our Lebesgue number a. We may thus assume that big sets do
exist, and we define a' to be the greatest lower bound of their diameters.
Clearly, 0 ≤ a' ≤ +∞. It will suffice to show that a' > 0; for if
a' = +∞, then any positive real number will do for a, and if a' is real,
we can take a to be a'. We therefore assume that a' = 0, and we deduce
a contradiction from this assumption. Since every big set must have at
least two points, we infer from a' = 0 that for each positive integer n
there exists a big set B_n such that 0 < d(B_n) < 1/n. We now choose a
point x_n in each B_n. Since X is sequentially compact, the sequence
{x_n} has a subsequence which converges to some point x in X. The
point x belongs to at least one set G_{i_0} in our open cover, and since G_{i_0} is
open, x is the center of some open sphere S_r(x) contained in G_{i_0}. Let
S_{r/2}(x) be the concentric open sphere with radius r/2. Since our
subsequence of {x_n} converges to x, there are infinitely many positive integers
n for which x_n is in S_{r/2}(x). Let n_0 be one of these positive integers
which is so large that 1/n_0 < r/2. Since d(B_{n_0}) < 1/n_0 < r/2, we see by
Problem 10-3 that B_{n_0} ⊆ S_r(x) ⊆ G_{i_0}. This contradicts the fact that
B_{n_0} is a big set, and completes the proof.
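For a concrete feel for Lebesgue numbers, here is a minimal sketch for the space X = [0, 1] covered by finitely many open intervals. The names `depth`, `lebesgue_number` and `fits`, the sample cover, and the grid-based estimate are all assumptions made for this illustration; the lemma itself is not constructive. The idea is that `depth(x)` is the largest radius of an open sphere about x that fits inside a single interval of the cover, and any positive number not exceeding the minimum of `depth` over [0, 1] is a Lebesgue number.

```python
from typing import List, Sequence, Tuple

Interval = Tuple[float, float]  # open interval (u, v)


def depth(x: float, cover: List[Interval]) -> float:
    """Largest r such that (x - r, x + r) lies in one interval of the cover."""
    return max(min(x - u, v - x) for u, v in cover)


def lebesgue_number(cover: List[Interval], steps: int = 1000) -> float:
    """Grid minimum of depth over [0, 1], lowered by half the grid spacing.
    Since depth changes by at most |x - y| between x and y, the result does
    not exceed the true minimum."""
    grid_min = min(depth(k / steps, cover) for k in range(steps + 1))
    return grid_min - 0.5 / steps


def fits(subset: Sequence[float], cover: List[Interval]) -> bool:
    """True if the whole subset lies in a single interval of the cover."""
    return any(all(u < x < v for x in subset) for u, v in cover)


cover = [(-0.1, 0.6), (0.4, 1.1)]   # assumed sample cover of [0, 1]
a = lebesgue_number(cover)
```

With this cover the estimate is a little under 0.1. A set such as {0.45, 0.5, 0.54}, whose diameter is below `a`, lies in a single interval, whereas {0.3, 0.7} has diameter 0.4 and lies in neither.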


The next stage requires the following concepts. Let X be a metric 
space. If ε > 0 is given, a subset A of X is called an ε-net if A is finite
and X = ∪_{a∈A} S_ε(a), that is, if A is finite and its points are scattered
through X in such a way that each point of X is distant by less than
ε from at least one point of A. The metric space X is said to be totally
bounded if it has an ε-net for each ε > 0. It is clear that if X is totally
bounded, then it is also bounded; for if A is an ε-net, then the diameter of
A is finite (since A is a non-empty finite set) and d(X) ≤ d(A) + 2ε.
Total boundedness is actually a much stronger property than bounded- 
ness, as we shall see below. 
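The diameter bound for a space with an ε-net follows from the triangle inequality applied twice. A short derivation, in which the points x, y of X and a, b of A are names introduced here for illustration:

```latex
% x, y are arbitrary points of X; a, b are points of the \epsilon-net A
% chosen so that d(x,a) < \epsilon and d(y,b) < \epsilon.
\begin{align*}
d(x,y) &\le d(x,a) + d(a,b) + d(b,y) \\
       &<   \epsilon + d(A) + \epsilon .
\end{align*}
% Taking the least upper bound over all x, y in X gives
\[
d(X) \le d(A) + 2\epsilon .
\]
```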


Theorem D. Every sequentially compact metric space is totally bounded.
PROOF. Let X be a sequentially compact metric space, and let ε > 0
be given. Choose a point a_1 in X and form the open sphere S_ε(a_1). If
this open sphere contains every point of X, then the single-element set
{a_1} is an ε-net. If there are points outside of S_ε(a_1), let a_2 be such a
point and form the set S_ε(a_1) ∪ S_ε(a_2). If this union contains every
point of X, then the two-element set {a_1, a_2} is an ε-net. If we continue in
this way, some union of the form S_ε(a_1) ∪ S_ε(a_2) ∪ ... ∪ S_ε(a_n) will
necessarily contain every point of X; for if this process could be continued
indefinitely, then the sequence {a_1, a_2, ..., a_n, ...} would be a
sequence with no convergent subsequence (any two of its terms are at
distance at least ε from one another), contrary to the assumed
sequential compactness of X. We see by this that some finite set of the
form {a_1, a_2, ..., a_n} is an ε-net, so X is totally bounded.
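The selection process used in this proof is easy to imitate on a finite sample of points. The sketch below is an illustration only: the function name `greedy_epsilon_net`, the sample grid of the unit square, and the value of ε are assumptions made here, and a finite sample merely stands in for the space X.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def greedy_epsilon_net(points: Sequence[Point], eps: float) -> List[Point]:
    """Choose a_1, then a point outside S_eps(a_1), and so on, until the
    open spheres of radius eps about the chosen points cover every point."""
    net: List[Point] = []
    for p in points:
        # p is kept only if it lies outside every sphere chosen so far
        if all(math.dist(p, a) >= eps for a in net):
            net.append(p)
    return net


# Finite sample of the unit square standing in for the space X.
sample = [(i / 10, j / 10) for i in range(11) for j in range(11)]
eps = 0.25
net = greedy_epsilon_net(sample, eps)
```

Every sample point ends up within ε of some chosen point, and the chosen points are pairwise at least ε apart. The second property is what rules out a convergent subsequence if the process never stops.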


We are now in a position to complete this line of thought by proving 
that compactness is implied by sequential compactness. 


Theorem E. Every sequentially compact metric space is compact.

PROOF. Let X be a sequentially compact metric space, and let {G_i} be
an open cover. By Theorem C, this open cover has a Lebesgue number a.
We put ε = a/3, and use Theorem D to find an ε-net


A = {a_1, a_2, ..., a_n}.


For each k = 1, 2, ..., n, we have d(S_ε(a_k)) ≤ 2ε = 2a/3 < a. By
the definition of a Lebesgue number, for each k we can find a G_{i_k} such
that S_ε(a_k) ⊆ G_{i_k}. Since every point of X belongs to one of the S_ε(a_k)’s,
the class {G_{i_1}, G_{i_2}, ..., G_{i_n}} is a finite subcover of {G_i}. X is therefore
compact.
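The two ingredients of this proof, a Lebesgue number and an ε-net, can be combined mechanically. The following sketch does so for X = [0, 1] and a cover by open intervals. It is an illustration under assumptions: the helper names, the sample cover, and the grid estimate of the Lebesgue number are all introduced here and are not part of the text.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # open interval (u, v)


def depth(x: float, cover: List[Interval]) -> float:
    """Largest r such that (x - r, x + r) lies in one interval of the cover."""
    return max(min(x - u, v - x) for u, v in cover)


def contains_sphere(interval: Interval, c: float, eps: float) -> bool:
    """True if S_eps(c), taken inside X = [0, 1], lies in the open interval."""
    u, v = interval
    return (u < 0 or u <= c - eps) and (v > 1 or c + eps <= v)


def finite_subcover(cover: List[Interval], steps: int = 1000) -> List[Interval]:
    # Lebesgue number a: grid minimum of depth, lowered by half the spacing.
    a = min(depth(k / steps, cover) for k in range(steps + 1)) - 0.5 / steps
    if a <= 0:
        raise ValueError("the intervals do not cover [0, 1]")
    eps = a / 3
    # The points 0, eps, 2*eps, ... (up to 1) form an eps-net of [0, 1].
    net = [min(k * eps, 1.0) for k in range(int(1 / eps) + 1)]
    chosen: List[Interval] = []
    for c in net:
        # S_eps(c) has diameter at most 2*eps < a, so some interval holds it.
        G = next(g for g in cover if contains_sphere(g, c, eps))
        if G not in chosen:
            chosen.append(G)
    return chosen


cover = [(-0.1, 0.3), (0.2, 0.6), (0.25, 0.55),
         (0.5, 0.9), (0.8, 1.1), (0.4, 0.45)]
subcover = finite_subcover(cover)
```

Here the cover is already finite, so the interest lies only in watching the argument run: one member of the cover is picked for each net point, and the redundant intervals are never selected.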
Our results so far can be summarized by the statement that if X is a
metric space, then the following three conditions are all equivalent to 
one another: 

(1) X is compact; 

(2) X is sequentially compact; 

(3) X has the Bolzano-Weierstrass property. 

Also, of course, we have as by-products the additional information that a 
compact metric space is totally bounded and that every open cover of a 
compact metric space has a Lebesgue number. The latter fact has the 
following useful consequence. 


Theorem F. Any continuous mapping of a compact metric space into a 
metric space is uniformly continuous. 

PROOF. Let f be a continuous mapping of a compact metric space X into
a metric space Y, and let d_1 and d_2 be the metrics on X and Y. Let
ε > 0 be given. For each point x in X, consider its image f(x) and the
open sphere S_{ε/2}(f(x)) centered on this image with radius ε/2. Since f
is continuous, the inverse image of each of these open spheres is an open
subset of X, and the class of all such inverse images is an open cover of X.
Since X is compact, Theorem C guarantees that this open cover has a
Lebesgue number δ. If x and x' are any two points in X for which
d_1(x,x') < δ, then the set {x,x'} is a set with diameter less than δ, both
points belong to the inverse image of some one of the above open spheres,
both f(x) and f(x') belong to one of these open spheres, and therefore
d_2(f(x), f(x')) < ε, which shows that f is indeed uniformly continuous.
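A small numerical check may help fix the difference between having one δ for the whole space and having none. The sketch below is an illustration with assumed examples (the functions x^2 and 1/x, the grids, and the helper `worst_gap` are not from the text). On the compact interval [0, 1] the choice δ = ε/2 works for x^2 at every point, while on the non-compact interval (0, 1] the same δ fails badly for 1/x.

```python
from typing import Callable, List


def worst_gap(f: Callable[[float], float], delta: float, xs: List[float]) -> float:
    """Largest |f(x) - f(y)| over pairs of grid points with |x - y| < delta.
    The grid xs is assumed to be sorted in increasing order."""
    worst = 0.0
    for i, x in enumerate(xs):
        for y in xs[i + 1:]:
            if y - x >= delta:
                break
            worst = max(worst, abs(f(x) - f(y)))
    return worst


steps = 400
closed_grid = [k / steps for k in range(steps + 1)]      # sample of [0, 1]
open_grid = [k / steps for k in range(1, steps + 1)]     # sample of (0, 1]

eps = 0.1
delta = eps / 2   # |x^2 - y^2| = |x + y| |x - y| <= 2 |x - y| on [0, 1]

square_gap = worst_gap(lambda x: x * x, delta, closed_grid)
recip_gap = worst_gap(lambda x: 1.0 / x, delta, open_grid)
```

The first gap stays below ε, as uniform continuity requires; the second is large because 1/x changes rapidly near 0, where the interval (0, 1] fails to be compact.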


We continue our study of compact metric spaces in the next section. 


Problems 


1. Let A be a subspace of a metric space X, and show that the closure Ā
is totally bounded ⇔ A is totally bounded.

2. Show that a subspace of R^n is bounded ⇔ it is totally bounded.

3. Prove the Bolzano-Weierstrass theorem for R^n: if X is a closed and
bounded subset of R^n, then every infinite subset of X has a limit
point in X.

4. Show that a compact metric space is separable.


25. ASCOLI’S THEOREM 


Our previous characterizations of compactness for a metric space 
strongly suggest that this property is related to completeness and total 
boundedness in some way yet to be formulated. We begin by proving a
theorem which clarifies this situation. 


Theorem A. A metric space is compact ⇔ it is complete and totally bounded.
PROOF. Let X be a metric space. The first half of our proof is easy, for
if X is compact, then it is totally bounded by Theorem 24-D, and it is 
complete by Problem 12-2 and the fact that every sequence (and therefore 
every Cauchy sequence) has a convergent subsequence. 

We now assume that X is complete and totally bounded, and we 
prove that X is compact by showing that every sequence has a convergent 
subsequence. Since X is complete, it suffices to show that every sequence 
has a Cauchy subsequence. Consider an arbitrary sequence 


S_1 = {x_11, x_12, x_13, ...}.


The reason for this notation will soon be clear. Since X is totally
bounded, there exists a finite class of open spheres, each with radius 1/2,
whose union equals X; and from this we see that S_1 has a subsequence
S_2 = {x_21, x_22, x_23, ...} all of whose points lie in some one open sphere
of radius 1/2. Another application of the total boundedness of X shows
similarly that S_2 has a subsequence S_3 = {x_31, x_32, x_33, ...} all of
whose points lie in some one open sphere of radius 1/3. We continue
forming successive subsequences in this manner, and we let


S = {x_11, x_22, x_33, ...}


be the “diagonal” subsequence of S_1. By the nature of this construction,
S is clearly a Cauchy subsequence of S_1, and our proof is complete.
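The reason the diagonal sequence is a Cauchy sequence can be spelled out in one estimate. The sketch assumes, as the construction suggests, that for each N ≥ 2 all points of the N-th subsequence S_N lie in one open sphere of radius 1/N; the center c_N is a name introduced here for illustration.

```latex
% Fix N >= 2 and let c_N be the center of an open sphere of radius 1/N
% containing every point of S_N. For m, n >= N the sequences S_m and S_n
% are subsequences of S_N, so x_{mm} and x_{nn} both belong to S_N, and
\begin{align*}
d(x_{mm}, x_{nn}) &\le d(x_{mm}, c_N) + d(c_N, x_{nn}) \\
                  &<   \frac{1}{N} + \frac{1}{N} = \frac{2}{N}.
\end{align*}
% Given \epsilon > 0, any N with 2/N < \epsilon therefore serves in the
% definition of a Cauchy sequence.
```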


This theorem gives total boundedness a prominent part in deter- 
mining whether a metric space is compact or not. As we know, many 
metric spaces occur as closed subspaces of complete metric spaces, and 
for these we can make the role of total boundedness even more striking. 


Theorem B. A closed subspace of a complete metric space is compact ⇔ it
is totally bounded.

PROOF. Since a closed subspace of a complete metric space is auto- 
matically complete, this is a direct consequence of Theorem A. 


What sort of property is total boundedness? We have seen that it 
always implies boundedness, and we know by Problem 24-2 that the 
converse of this is true for subspaces of the finite-dimensional Euclidean
space R^n. It is false, however, that boundedness implies total boundedness
for subspaces of the infinite-dimensional Euclidean space R^∞. In
fact, the closed unit sphere in R^∞, defined by X = {x : ||x|| ≤ 1}, is not
totally bounded, though it is obviously bounded. To see this, it suffices
to observe that the sequence {x_n} in X defined by

x_1 = {1, 0, 0, ..., 0, ...},
x_2 = {0, 1, 0, ..., 0, ...},
x_3 = {0, 0, 1, ..., 0, ...},
. . .

has no convergent subsequence, for the distance from any point of the
sequence to any other is √2. This shows that X is not compact, hence
not totally bounded. The following fact, which we cannot prove here
(see Sec. 47), may add to the reader’s intuition about the relation between
boundedness and total boundedness: a Banach space is finite-dimensional
⇔ every bounded subspace is totally bounded.
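The mutual distances of the unit vectors in this example can be checked directly on truncated coordinates. The sketch below is only an illustration: the helper `unit_vector` and the truncation length `dim` are assumptions made here, and truncating is harmless because each vector has a single non-zero coordinate.

```python
import math
from typing import List


def unit_vector(n: int, dim: int) -> List[float]:
    """First `dim` coordinates of the n-th unit vector (n counted from 1)."""
    return [1.0 if k == n else 0.0 for k in range(1, dim + 1)]


dim = 6  # truncation length, large enough for the indices used below
distances = {
    math.dist(unit_vector(m, dim), unit_vector(n, dim))
    for m in range(1, dim + 1)
    for n in range(1, dim + 1)
    if m != n
}
```

All pairs are at the same distance, the square root of 2, so no subsequence can be a Cauchy sequence, and no finite class of open spheres of radius less than half that distance can contain all the points.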

We now turn to the problem of characterizing compact subspaces 
of C(X,R) or C(X,C). By Theorem B, we know at once that a closed
subspace of C(X,R) or C(X,C) is compact ⇔ it is totally bounded.
Unfortunately, however, this information is of little value in most
applications to analysis. What is needed is a criterion expressed in terms
of the individual functions in the subspace. Furthermore, for most of the
applications it suffices to consider only the case in which X is a compact
metric space. We describe the relevant concept as follows. Let X be a
compact metric space with metric d, and let A be a non-empty set of
continuous real or complex functions defined on X. If f is a function in
A, then by Theorem 24-F this function is uniformly continuous; that is,
for each ε > 0, there exists δ > 0 such that d(x,x') < δ ⇒ |f(x) - f(x')| < ε.
In general, δ depends not only on ε but also on the function f.
A is said to be equicontinuous if for each ε a δ can be found which serves at
once for all functions f in A, that is, if for each ε > 0 there exists δ > 0
such that for every f in A, d(x,x') < δ ⇒ |f(x) - f(x')| < ε.
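Equicontinuity can be probed numerically by asking how much any member of a family can vary over points closer together than δ. The sketch below is an illustration under assumptions: the two families, sin(nx)/n and x^n on [0, 1], are examples chosen here (they are not from the text), each is truncated to finitely many members, and the grid computation only approximates the true oscillation.

```python
import math
from typing import Callable, List


def oscillation(f: Callable[[float], float], delta: float, steps: int = 200) -> float:
    """Largest |f(x) - f(y)| over grid points of [0, 1] with |x - y| < delta."""
    xs = [k / steps for k in range(steps + 1)]
    worst = 0.0
    for i, x in enumerate(xs):
        for y in xs[i + 1:]:
            if y - x >= delta:
                break
            worst = max(worst, abs(f(x) - f(y)))
    return worst


def family_oscillation(family: List[Callable[[float], float]], delta: float) -> float:
    """Worst oscillation over all members of the family for one common delta."""
    return max(oscillation(f, delta) for f in family)


# Illustrative families, truncated to finitely many members.
sines = [lambda x, n=n: math.sin(n * x) / n for n in range(1, 51)]
powers = [lambda x, n=n: x ** n for n in range(1, 201)]

delta = 0.1
sine_osc = family_oscillation(sines, delta)
power_osc = family_oscillation(powers, delta)
```

For the first family the oscillation never exceeds δ, whatever n is, because each member has slope at most 1 in absolute value; this is the behaviour equicontinuity demands. For the powers the oscillation is close to 1 for large n, and shrinking δ does not help, since x^n still climbs from nearly 0 to 1 just below x = 1 once n is large enough.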


Theorem C (Ascoli’s Theorem). If X is a compact metric space, then a
closed subspace of C(X,R) or C(X,C) is compact ⇔ it is bounded and
equicontinuous.

PROOF. Let d be the metric on X, and let F be a closed subspace of
C(X,R) or C(X,C).

We first assume that F is compact, and we prove that it is bounded
and equicontinuous. Problem 21-6 shows that F is bounded. We prove
that F is equicontinuous as follows. Let ε > 0 be given. Since F is
compact, and therefore totally bounded, we can find an (ε/3)-net
{f_1, f_2, ..., f_n} in F. Each f_k is uniformly continuous, so for each
k = 1, 2, ..., n, there exists δ_k > 0 such that d(x,x') < δ_k ⇒
|f_k(x) - f_k(x')| < ε/3. We now define δ to be the smallest of the numbers
δ_1, δ_2, ..., δ_n. If f is any function in F and f_k is chosen so that
||f - f_k|| < ε/3, then

d(x,x') < δ ⇒ |f(x) - f(x')| ≤ |f(x) - f_k(x)| + |f_k(x) - f_k(x')|
+ |f_k(x') - f(x')| < ε/3 + ε/3 + ε/3 = ε.


This shows that F is equicontinuous. 

We now assume that F is bounded and equicontinuous, and we 
demonstrate that it is compact by showing that every sequence in it has a 
convergent subsequence. Since F is closed, and therefore complete, 
it suffices to show that every sequence in it has a Cauchy subsequence. 
As we proceed, the reader will see that our proof is similar in structure 
to the last part of the proof of Theorem A. By Problem 24-4, X has a 
countable dense subset. Let the points of this subset be arranged in a 
sequence {x_i} = {x_2, x_3, ..., x_i, ...}, where we start with the
subscript 2 for reasons which will become clear below. Now let


S_1 = {f_11, f_12, f_13, ...}


be an arbitrary sequence in F. Our hypothesis that F is bounded means
that there exists a real number K such that ||f|| ≤ K for every f in F, or
equivalently, such that |f(x)| ≤ K for every f in F and every x in X.
Consider the sequence of numbers {f_{1j}(x_2)}, j = 1, 2, 3, ..., and
observe that since this sequence is bounded, it has a convergent
subsequence. Let S_2 = {f_21, f_22, f_23, ...} be a subsequence of S_1 such that
{f_{2j}(x_2)} converges. We next consider the sequence of numbers {f_{2j}(x_3)},
and in the same way we let S_3 = {f_31, f_32, f_33, ...} be a subsequence of
S_2 such that {f_{3j}(x_3)} converges. If we continue this process, we get an
array of sequences of the form


S_1 = {f_11, f_12, f_13, ...},
S_2 = {f_21, f_22, f_23, ...},
S_3 = {f_31, f_32, f_33, ...},
. . .


in which each sequence is a subsequence of the one directly above it, and
for each i the sequence S_i = {f_{i1}, f_{i2}, f_{i3}, ...} has the property that
{f_{ij}(x_i)} is a convergent sequence of numbers. If we define f_1, f_2, f_3, ...
by f_1 = f_11, f_2 = f_22, f_3 = f_33, ..., then the sequence S = {f_1, f_2, f_3, ...}
is the “diagonal” subsequence of S_1. It is clear from this construction
that for each point x_i in our dense subset of X, the sequence {f_n(x_i)} is a
convergent sequence of numbers. It remains only to show that S, as a
sequence of functions in C(X,R) or C(X,C), is a Cauchy sequence. Let
ε > 0 be given. Since F is equicontinuous, there exists δ > 0 such that
d(x,x') < δ ⇒ |f_n(x) - f_n(x')| < ε/3 for all functions f_n in S. We now
form the open sphere S_δ(x_i) with radius δ centered on each of the x_i’s.
Since the x_i’s are dense, these open spheres form an open cover of X, and
since X is compact, X = ∪_{i=2}^{i_0} S_δ(x_i) for some i_0. It is easy to see that
there exists a positive integer n_0 such that m,n ≥ n_0 ⇒ |f_m(x_i) - f_n(x_i)| <
ε/3 for all the points x_2, x_3, ..., x_{i_0}. Our proof is completed by the
remark that if x is an arbitrary point of X, then an x_i can be found in the
set {x_2, x_3, ..., x_{i_0}} such that d(x,x_i) < δ, and that therefore


m,n ≥ n_0 ⇒ |f_m(x) - f_n(x)| ≤ |f_m(x) - f_m(x_i)| + |f_m(x_i) - f_n(x_i)|
+ |f_n(x_i) - f_n(x)| < ε/3 + ε/3 + ε/3 = ε.


We observe that the total boundedness in Theorem B is replaced, 
in Ascoli’s theorem, by the weaker condition of boundedness, and that the 
resulting deficiency is made up by the additional condition of
equicontinuity.¹ For several applications of Ascoli’s theorem (which is
sometimes called Arzelà’s theorem) to problems in analysis, see Goffman
[13, pp. 151-156] or Kolmogorov and Fomin [26, vol. 1, secs. 17-20].


Problems 


1. Let A be a subspace of a complete metric space, and show that the
closure Ā is compact ⇔ A is totally bounded.

2. Let X be a compact metric space and F a closed subspace of C(X,R)
or C(X,C). Show that F is compact if it is equicontinuous and
F_x = {f(x) : f ∈ F} is a bounded set of numbers for each point x in X.

3. Show that R^∞ is not locally compact.

4. By considering the sequence of functions in C[0,1] defined by


f_n(x) = nx


for 0 ≤ x ≤ 1/n, f_n(x) = 1 for 1/n ≤ x ≤ 1, show that C[0,1] is not
locally compact.


1 The following terminology is often used with Ascoli’s theorem. Let F be any
non-empty set of real or complex functions defined on an arbitrary non-empty set X.
The statement that a function f in F is bounded means, of course, that there exists a
real number K such that |f(x)| ≤ K for every x in X. The functions in F are often
said to be uniformly bounded (or F is called a uniformly bounded set of functions) if
there exists a single K which works in this way for all f’s in F, i.e., if there is a K
such that |f(x)| ≤ K for every x in X and every f in F. If we were to use this
expression, Ascoli’s theorem would take the following form: if X is a compact metric
space, then a closed subspace of C(X,R) or C(X,C) is compact ⇔ it is uniformly
bounded and equicontinuous. The uniform boundedness here is merely boundedness
as a subset of the metric space C(X,R) or C(X,C).


CHAPTER FIVE 


Separation 


A topological space may be very sparsely endowed with open sets. 
As we know, some spaces have only two, the empty set and the full space. 
In a discrete space, on the other hand, every set is open. Most of the 
familiar spaces of geometry and analysis fall somewhere in between these 
two artificial extremes. The so-called separation properties enable us to 
state with precision that a given topological space has a rich enough 
supply of open sets to serve whatever purpose we may have in mind. 

The separation properties are of concern to us because the supply 
of open sets possessed by a topological space is intimately linked to its 
supply of continuous functions; and since continuous functions are of 
central importance in topology, we naturally wish to guarantee that 
enough of them are present to make our discussions fruitful. If, for 
instance, the only open sets in a topological space are the empty set and 
the full space, then the only continuous functions present are the con- 
stants, and very little of interest can be said about these. In general 
terms, the more open sets there are, the more continuous functions a space 
has. Discrete spaces have continuous functions in the greatest possible 
abundance, for all functions are continuous. However, few really 
important spaces are discrete, so this goes a bit too far. The separation 
properties make it possible for us to be sure that our spaces have enough 
continuous functions without committing ourselves to the excesses of
discrete spaces.


26. T_1-SPACES AND HAUSDORFF SPACES


One of the most natural things to require of a topological space is 
that each of its points be a closed set.¹ The separation property which
relates to this is the following. A T_1-space is a topological space in which,
given any pair of distinct points, each has a neighborhood which does
not contain the other.² It is obvious that any subspace of a T_1-space
is also a T_1-space. Our first theorem shows that T_1-spaces are precisely
those topological spaces in which points are closed.


Theorem A. A topological space is a T_1-space ⇔ each point is a closed set.
PROOF. If X is a topological space, then an arbitrary point x in X is
closed ⇔ its complement is open ⇔ each point y different from x has a
neighborhood which does not contain x ⇔ X is a T_1-space.


Our next separation property is slightly stronger. A Hausdorff space 
is a topological space in which each pair of distinct points can be separated 
by open sets, in the sense that they have disjoint neighborhoods. Every 
Hausdorff space is clearly a T_1-space, and every subspace of a Hausdorff
space is also a Hausdorff space. 
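On a finite set the two separation properties can be checked by brute force, which makes the definitions easy to experiment with. The sketch below is an illustration under assumptions: a topology is represented as a list of frozensets, a neighborhood of a point is taken to be an open set containing it, and the three sample topologies on a two-point set (including the one labeled `one_sided`) are examples chosen here.

```python
from itertools import combinations
from typing import FrozenSet, List, Set

Topology = List[FrozenSet[int]]


def is_t1(points: Set[int], opens: Topology) -> bool:
    """Each of two distinct points has an open neighborhood missing the other."""
    return all(
        any(x in G and y not in G for G in opens)
        and any(y in G and x not in G for G in opens)
        for x, y in combinations(points, 2)
    )


def is_hausdorff(points: Set[int], opens: Topology) -> bool:
    """Any two distinct points have disjoint open neighborhoods."""
    return all(
        any(x in G and y in H and not (G & H) for G in opens for H in opens)
        for x, y in combinations(points, 2)
    )


points = {0, 1}
two_open_sets = [frozenset(), frozenset({0, 1})]             # empty set and full space
one_sided = [frozenset(), frozenset({1}), frozenset({0, 1})]  # only the point 1 is open
discrete = [frozenset(), frozenset({0}), frozenset({1}), frozenset({0, 1})]

results = {
    "two_open_sets": (is_t1(points, two_open_sets), is_hausdorff(points, two_open_sets)),
    "one_sided": (is_t1(points, one_sided), is_hausdorff(points, one_sided)),
    "discrete": (is_t1(points, discrete), is_hausdorff(points, discrete)),
}
```

Of the three, only the discrete topology passes either test. On a finite set the two properties cannot be told apart in this way; separating them requires an infinite space.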


Theorem B. The product of any non-empty class of Hausdorff spaces is a 
Hausdorff space. 

PROOF. Let X = P_i X_i be the product of a non-empty class of Hausdorff
spaces X_i. If x = {x_i} and y = {y_i} are two distinct points in X, then
we must have x_{i_0} ≠ y_{i_0} for at least one index i_0. Since X_{i_0} is a Hausdorff
space, x_{i_0} and y_{i_0} can be separated by open sets in X_{i_0}. These two
disjoint open subsets of X_{i_0} give rise to two disjoint sets in the defining
open subbase for X, each of which contains one of the points x and y.


Most of the important facts about Hausdorff spaces depend on the 
following theorem. 


Theorem C. In a Hausdorff space, any point and disjoint compact sub- 
space can be separated by open sets, in the sense that they have disjoint 
neighborhoods. 


PROOF. Let X be a Hausdorff space, x a point in X, and C a compact
subspace of X which does not contain x. We construct a disjoint pair of
open sets G and H such that x ∈ G and C ⊆ H. Let y be a point in C.
Since X is a Hausdorff space, x and y have disjoint neighborhoods G_y and
H_y. If we allow y to vary over C, we obtain a class of H_y’s whose union
contains C; and since C is compact, some finite subclass, which we denote
by {H_1, H_2, ..., H_n}, is such that C ⊆ ∪_{i=1}^n H_i. If G_1, G_2, ..., G_n
are the neighborhoods of x which correspond to the H_i’s, we put


G = ∩_{i=1}^n G_i


and H = ∪_{i=1}^n H_i and observe that these two sets have the required
properties.


1 It is customary here to drop the distinction between a point x in a space and the
set {x} which contains only that point. This convention often makes it possible to
avoid cumbersome modes of expression, and we shall use it freely.

2 The T_i-space nomenclature, for i = 0, 1, ..., 5, was introduced by Alexandroff
and Hopf in their famous treatise [2]. The T refers to the German word
Trennungsaxiom, which means “separation axiom.” The term T_1-space is the only
one of these which is still in general use.


In Theorem 21-A we proved that every closed subspace of a compact 
space is compact, and in Problem 21-4 we saw that a compact subspace of 
a compact space need not be closed. We now use the preceding theorem 
to show that compact subspaces of Hausdorff spaces are always closed. 


Theorem D. Every compact subspace of a Hausdorff space is closed. 
PROOF. Let C be a compact subspace of a Hausdorff space X. We prove
that C is closed by showing that its complement C' is open. C' is open
if it is empty, so we may assume that it is non-empty. Let x be any
point in C'. By Theorem C, x has a neighborhood G such that x ∈ G ⊆ C'.
This shows that C' is a union of open sets and is therefore open itself.


One of the most useful consequences of this result is 


Theorem E. A one-to-one continuous mapping of a compact space onto a 
Hausdorff space is a homeomorphism. 

PROOF. Let $f: X \to Y$ be a one-to-one continuous mapping of a compact
space X onto a Hausdorff space Y. We must show that f(G) is open in
Y whenever G is open in X, and for this it suffices to show that f(F) is
closed in Y whenever F is closed in X. If F is empty, f(F) is also empty
and therefore closed, so we may assume that F is non-empty. By
Theorem 21-A, F is a compact subspace of X; by Theorem 21-B, f(F) is a
compact subspace of Y; and we complete the proof by using the preceding
theorem to conclude that f(F) is a closed subspace of Y.


Compact Hausdorff spaces are among the most important of all 
topological spaces, and in the following sections and chapters we shall 
become thoroughly acquainted with their major properties. 


Problems 


1. Show that the topological space defined in Example 16-5 is not a
$T_1$-space.

2. Show that the topological space defined in Example 16-4 is a $T_1$-space
but not a Hausdorff space.

3. Show that any finite $T_1$-space is discrete.

4. If X is a $T_1$-space with at least two points, show that an open base
which contains X as a member remains an open base if X is dropped.

5. Let X be a topological space, Y a Hausdorff space, and A a subspace
of X. Show that a continuous mapping of A into Y has at most one
continuous extension to a mapping of the closure $\bar{A}$ into Y. Problem 17-2 is a
special case of this statement.

6. If f is a continuous mapping of a topological space X into a Hausdorff
space Y, prove that the graph of f is a closed subset of the product
$X \times Y$.

7. Let X be any non-empty set, and prove that in the lattice of all
topologies on X each chain has at most one compact Hausdorff
topology as a member. (It is interesting to speculate about whether
a compact Hausdorff topology can be defined on an arbitrary non-empty
set.)

8. Let X be an arbitrary topological space and $\{x_n\}$ a sequence of
points in X. This sequence is said to be convergent if there exists a
point x in X such that for each neighborhood G of x a positive integer
$n_0$ can be found with the property that $x_n$ is in G for all $n \ge n_0$.
The point x is called a limit of the sequence, and we say that $x_n$
converges to x (and symbolize this by $x_n \to x$).

(a) Show that in Example 16-3 any sequence converges to every
point of the space. This is the reason why the above point x
is called a limit instead of the limit.

(b) If X is a Hausdorff space, show that every convergent sequence
in X has a unique limit.

(c) Show that if $f: X \to Y$ is a continuous mapping of one topological
space into another, then $x_n \to x$ in X $\Rightarrow$ $f(x_n) \to f(x)$ in Y.
Prove that the converse of this is true if each point in X has a
countable open base.¹


1 The facts brought out in this problem are the main reasons why the concept
of a convergent sequence is not very important in the general theory of topological
spaces.


27. COMPLETELY REGULAR SPACES AND NORMAL SPACES


Let X be an arbitrary topological space, and consider the set C(X,R)
of all bounded continuous real functions defined on X. If for each pair
of distinct points x and y in X there exists a function f in C(X,R) such
that $f(x) \ne f(y)$, we say that C(X,R) separates points. It is easy to see
that if C(X,R) does separate points, then X is necessarily a Hausdorff
space; for assuming that $f(x) < f(y)$ and that r is a real number such that
$f(x) < r < f(y)$, then the sets $\{z : f(z) < r\}$ and $\{z : f(z) > r\}$ are disjoint
neighborhoods of x and y.

It is convenient to strengthen this separation property slightly by 
allowing one of the points to be an arbitrary closed subspace of X. A 
completely regular space is a $T_1$-space X with the property that if x is any
point in X and F any closed subspace of X which does not contain x, then
there exists a function f in C(X,R), all of whose values lie in the closed
unit interval [0,1], such that f(x) = 0 and f(F) = 1. It is worth noticing
that since constants are continuous, we could just as well have required
here that f be 1 at x and 0 on F, for the function $g = 1 - f$ has these
properties. We may think of completely regular spaces as $T_1$-spaces in
which continuous functions separate points and disjoint closed subspaces. 
Since points are closed in a completely regular space, it is permissible to 
take the closed subspace F to be a point, and it is clear by the above 
paragraph that every completely regular space is a Hausdorff space. 
It is also easy to see that every subspace of a completely regular space is 
completely regular. In Sec. 30 we give an explicit characterization of all 
completely regular spaces in terms of product spaces. 

Our next (and last) separation property is similar to that which 
defines a Hausdorff space, except that it applies to disjoint closed sets 
instead of merely to distinct points. A normal space is a $T_1$-space in
which each pair of disjoint closed sets can be separated by open sets, in 
the sense that they have disjoint neighborhoods. We shall see in the 
next section (as a consequence of Urysohn’s lemma) that every normal 
space is completely regular. 
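
For a finite topological space, the separation clause in the definition of normality can be checked by brute force. The following Python sketch is an illustration only: the function names and the representation of a topology as a list of frozensets are our own assumptions. It does not test the $T_1$ condition, since a finite $T_1$-space is discrete (Problem 26-3); it tests only whether disjoint closed sets have disjoint neighborhoods.

```python
from itertools import combinations

def closed_sets(points, opens):
    """Closed sets are the complements of the open sets."""
    return [points - g for g in opens]

def separated(a, b, opens):
    """True if some disjoint pair of open sets G, H satisfies a <= G and b <= H."""
    return any(a <= g and b <= h and not (g & h) for g in opens for h in opens)

def separates_disjoint_closed_sets(points, opens):
    """Check the separation clause for every pair of disjoint closed sets."""
    closed = closed_sets(points, opens)
    return all(separated(a, b, opens)
               for a, b in combinations(closed, 2) if not (a & b))

# Hypothetical example: a three-point space in which every non-empty open set contains 'a'.
X = frozenset("abc")
OPENS = [frozenset(), frozenset("a"), frozenset("ab"), frozenset("ac"), X]

# The discrete two-point space, for comparison.
D = frozenset({0, 1})
DISCRETE = [frozenset(), frozenset({0}), frozenset({1}), D]
```

In the three-point example the closed sets {b} and {c} are disjoint, but every neighborhood of either one contains the point a, so the check fails; in the discrete space it succeeds.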

Figure 25 is intended to illustrate and clarify the relations among our 
various separation properties: a topological space which possesses any one 
property, in the order of their definition, also possesses all properties 
which precede it; in other words, they have been defined in order of 
increasing strength. The figure also indicates that metric spaces and 
compact Hausdorff spaces are normal. We established the first of these 
facts in Problem 11-16, and we now prove the second. 


Theorem A. Every compact Hausdorff space is normal. 

PROOF. Let X be a compact Hausdorff space, and A and B disjoint
closed subsets of X. We must produce a disjoint pair of open sets G and
H such that $A \subseteq G$ and $B \subseteq H$. If either closed set is empty, we can
take the empty set as a neighborhood of it and the full space as a
neighborhood of the other. We may therefore assume that both A and B are
non-empty. Since X is compact, A and B are disjoint compact
subspaces of X. Let x be a point of A. By Theorem 26-C and our
hypothesis that X is Hausdorff, x and B have disjoint neighborhoods $G_x$ and
$H_x$. If we allow x to vary over A, we obtain a class of $G_x$'s whose union
contains A; and since A is compact, some finite subclass, which we denote
by $\{G_1, G_2, \ldots, G_n\}$, is such that $A \subseteq \bigcup_{i=1}^{n} G_i$. If $H_1, H_2, \ldots, H_n$
are the neighborhoods of B which correspond to the $G_i$'s, it is clear that
$G = \bigcup_{i=1}^{n} G_i$ and $H = \bigcap_{i=1}^{n} H_i$ are disjoint neighborhoods of A and B.


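[Figure 25: nested regions, from outermost to innermost: topological spaces; $T_1$-spaces; Hausdorff spaces; completely regular spaces; normal spaces. Metric spaces and compact Hausdorff spaces appear as regions inside the normal spaces.]

Fig. 25. The separation properties.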


In Sec. 29 we investigate the manner in which normal spaces, 
compact Hausdorff spaces, and metric spaces are related to one another. 


Problems 


1. Show that a closed subspace of a normal space is normal. 

2. Let X be a $T_1$-space, and show that X is normal $\Leftrightarrow$ each neighborhood
of a closed set F contains the closure of some neighborhood of F.

3. Assume, as Fig. 25 suggests, that a compact Hausdorff space X is
completely regular and that therefore C(X,R) separates points.
Use this to prove that the weak topology generated by C(X,R)
equals the given topology. Show further that if S is any subset of
C(X,R) which also separates points, then the weak topology generated
by S also equals the given topology.

4. Let X be a completely regular space, and show from the definition
that the weak topology generated by C(X,R) equals the given
topology.


28. URYSOHN'S LEMMA AND THE TIETZE EXTENSION THEOREM


As we suggested in the introduction to this chapter, one of the main 
purposes served by assuming that a topological space is rich in open sets is 
to guarantee that it is also rich in continuous functions. The following 
is the fundamental theorem in this direction. 


Theorem A (Urysohn’s Lemma). Let X be a normal space, and let A and 
B be disjoint closed subspaces of X. Then there exists a continuous real 
function f defined on X, all of whose values lie in the closed unit interval
[0,1], such that f(A) = 0 and f(B) = 1. 

PROOF. $B'$ is a neighborhood of the closed set A, so by the normality of
X and Problem 27-2, A has a neighborhood $U_{1/2}$ such that

$$A \subseteq U_{1/2} \subseteq \overline{U}_{1/2} \subseteq B'.$$

$U_{1/2}$ and $B'$ are neighborhoods of the closed sets A and $\overline{U}_{1/2}$, so in the same
way there exist open sets $U_{1/4}$ and $U_{3/4}$ such that

$$A \subseteq U_{1/4} \subseteq \overline{U}_{1/4} \subseteq U_{1/2} \subseteq \overline{U}_{1/2} \subseteq U_{3/4} \subseteq \overline{U}_{3/4} \subseteq B'.$$

If we continue this process, for each dyadic rational number of the form
$t = m/2^n$ (where $n = 1, 2, 3, \ldots$ and $m = 1, 2, \ldots, 2^n - 1$) we have
an open set of the form $U_t$, such that

$$t_1 < t_2 \Rightarrow A \subseteq U_{t_1} \subseteq \overline{U}_{t_1} \subseteq U_{t_2} \subseteq \overline{U}_{t_2} \subseteq B'.$$

We now define our function f by f(x) = 0 if x is in every $U_t$ and

$$f(x) = \sup \{t : x \notin U_t\}$$

otherwise. It is clear that the values of f lie in [0,1], and that f(A) = 0
and f(B) = 1. All that remains to be proved is that f is continuous.
All intervals of the form [0,a) and (a,1], where 0 < a < 1, constitute an
open subbase for [0,1]. It therefore suffices to show that $f^{-1}([0,a))$ and
$f^{-1}((a,1])$ are open. It is easy to see that $f(x) < a \Leftrightarrow$ x is in some $U_t$ for
$t < a$; and from this it follows that $f^{-1}([0,a)) = \{x : f(x) < a\} = \bigcup_{t<a} U_t$,
which is an open set. Similarly, $f(x) > a \Leftrightarrow$ x is outside of $\overline{U}_t$ for some
$t > a$; and therefore $f^{-1}((a,1]) = \{x : f(x) > a\} = \bigcup_{t>a} \overline{U}_t'$, which is an
open set.
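
The definition of f can be followed concretely in a simple special case. In the sketch below we assume (the choice is ours and is not part of the proof) that X is the real line, $A = (-\infty, 0]$, $B = [1, +\infty)$, and $U_t = (-\infty, t)$ for each dyadic rational t in (0,1); these sets satisfy the chain of inclusions required in the proof. The hypothetical function `urysohn_approx` takes the supremum only over dyadic rationals with denominator $2^{depth}$, so it approximates f from below with an error of at most $2^{-depth}$.

```python
from fractions import Fraction

def in_U(t, x):
    """Membership of x in the open set U_t = (-inf, t) (our assumed family)."""
    return x < t

def urysohn_approx(x, depth):
    """Largest t = m / 2**depth in (0, 1) with x outside U_t, or 0 if x lies in every such U_t."""
    denom = 2 ** depth
    outside = [Fraction(m, denom) for m in range(1, denom)
               if not in_U(Fraction(m, denom), x)]
    return max(outside) if outside else Fraction(0)
```

With depth 10, a point of A such as −1 is sent to 0, a point of B such as 2 is sent to 1023/1024, and the point 3/10 is sent to 307/1024. As the depth increases these values approach f(x), which in this special case is x cut off to the interval [0,1].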


It is clear from this theorem that every normal space is completely 
regular: all that is necessary is to take the closed subspace A to be a single 
point and to observe that the function f is exactly what is required in the 
definition of complete regularity. 

The following slightly more flexible form of Urysohn’s lemma will
be useful in applications.


Theorem B. Let X be a normal space, and let A and B be disjoint closed
subspaces of X. If [a,b] is any closed interval on the real line, then there
exists a continuous real function f defined on X, all of whose values lie in
[a,b], such that f(A) = a and f(B) = b.

PROOF. If a = b, we have only to define f by f(x) = a for every x, so we
may assume that a < b. If g is a function with the properties stated in
Urysohn’s lemma, then $f = (b - a)g + a$ has the properties required by
our theorem.


If there is given a continuous function defined on a subspace of a 
topological space, Urysohn’s lemma has an important bearing on the 
interesting question of whether this function can be extended continuously 
to the full space. The following is the classic theorem along these lines. 


Theorem C (the Tietze Extension Theorem). Let X be a normal space, F 
a closed subspace, and f a continuous real function defined on F whose values 
lie in the closed interval [a,b]. Then f has a continuous extension f' defined
on all of X whose values also lie in [a,b]. 

PROOF. If a = b, the conclusion of our theorem is obvious, so we may
assume that a < b. We may clearly assume that [a,b] is the smallest
closed interval which contains the range of f. Furthermore, the device
used in the proof of Theorem B enables us to assume that a = −1 and
b = 1. We begin by defining $f_0$ to be f. The domain of $f_0$ is our closed
subspace F, and we define two subsets $A_0$ and $B_0$ of F by

$$A_0 = \{x : f_0(x) \le -\tfrac{1}{3}\}$$

and $B_0 = \{x : f_0(x) \ge \tfrac{1}{3}\}$. $A_0$ and $B_0$ are disjoint, non-empty, and
closed in F; and since F is closed, they are closed in X. $A_0$ and $B_0$ are
thus a disjoint pair of closed subspaces of X, and by Theorem B there
exists a continuous function $g_0: X \to [-\tfrac{1}{3}, \tfrac{1}{3}]$ such that $g_0(A_0) = -\tfrac{1}{3}$
and $g_0(B_0) = \tfrac{1}{3}$. We next define $f_1$ on F by $f_1 = f_0 - g_0$, and we observe
that $|f_1(x)| \le \tfrac{2}{3}$. If $A_1 = \{x : f_1(x) \le (-\tfrac{1}{3})(\tfrac{2}{3})\}$ and

$$B_1 = \{x : f_1(x) \ge (\tfrac{1}{3})(\tfrac{2}{3})\},$$

then in the same way as above there exists a continuous function
$g_1: X \to [(-\tfrac{1}{3})(\tfrac{2}{3}), (\tfrac{1}{3})(\tfrac{2}{3})]$ such that $g_1(A_1) = (-\tfrac{1}{3})(\tfrac{2}{3})$ and
$g_1(B_1) = (\tfrac{1}{3})(\tfrac{2}{3})$.

We next define $f_2$ on F by $f_2 = f_1 - g_1 = f_0 - (g_0 + g_1)$, and we observe
that $|f_2(x)| \le (\tfrac{2}{3})^2$. By continuing in this manner, we get a sequence
$\{f_0, f_1, f_2, \ldots\}$ defined on F such that $|f_n(x)| \le (\tfrac{2}{3})^n$, and a sequence
$\{g_0, g_1, g_2, \ldots\}$ defined on X such that $|g_n(x)| \le (\tfrac{1}{3})(\tfrac{2}{3})^n$, with the
property that on F we have $f_n = f_0 - (g_0 + g_1 + \cdots + g_{n-1})$. We
now define $s_n$ by $s_n = g_0 + g_1 + \cdots + g_{n-1}$, and we regard the $s_n$'s as
the partial sums of an infinite series of functions in C(X,R). We know
that C(X,R) is complete, so by $|g_n(x)| \le (\tfrac{1}{3})(\tfrac{2}{3})^n$ and the fact that
$\sum_{n=0}^{\infty} (\tfrac{1}{3})(\tfrac{2}{3})^n = 1$, we see that $s_n$ converges uniformly on X to a bounded
continuous real function f' such that $|f'(x)| \le 1$. We conclude our
proof by noting that since $|f_n(x)| \le (\tfrac{2}{3})^n$, $s_n$ converges uniformly on
F to $f_0 = f$, and that therefore f' equals f on F and is a continuous
extension of f to the full space X which has the desired property.
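
The successive approximation in this proof can be traced numerically. The sketch below makes several simplifying assumptions of our own: X is the real line, F is a finite set of points (so that f could of course be extended by much simpler means), and the functions $g_n$ supplied by Theorem B are replaced by an explicit distance formula that is available in metric spaces. All function names are hypothetical. What it illustrates is the bookkeeping: the residual $f_n$ on F is cut down by a factor of 2/3 at each step, while the partial sums stay within [−1,1].

```python
def dist(x, points):
    """Distance from the real number x to a non-empty finite set of reals."""
    return min(abs(x - p) for p in points)

def urysohn_function(A, B, c):
    """Return a continuous g: R -> [-c, c] with g = -c on A and g = +c on B.

    A and B are disjoint finite lists of reals. The distance formula is a
    metric-space shortcut, not the construction used to prove Urysohn's lemma.
    """
    if not A and not B:
        return lambda x: 0.0
    if not A:
        return lambda x: c
    if not B:
        return lambda x: -c
    def g(x):
        da, db = dist(x, A), dist(x, B)
        return c * (da - db) / (da + db)
    return g

def tietze_partial_sum(f_on_F, steps):
    """Return s_steps = g_0 + ... + g_(steps-1) as a function on the real line.

    f_on_F maps each point of the finite set F to a value in [-1, 1].
    """
    residual = dict(f_on_F)            # values of f_n on F, starting with f_0 = f
    summands = []
    for n in range(steps):
        bound = (2.0 / 3.0) ** n       # |f_n| <= (2/3)^n on F
        A_n = [p for p, v in residual.items() if v <= -bound / 3.0]
        B_n = [p for p, v in residual.items() if v >= bound / 3.0]
        g_n = urysohn_function(A_n, B_n, bound / 3.0)
        summands.append(g_n)
        # f_(n+1) = f_n - g_n on F
        residual = {p: v - g_n(p) for p, v in residual.items()}
    return lambda x: sum(g(x) for g in summands)

# Hypothetical data: F = {0, 1, 2, 3} with prescribed values in [-1, 1].
F_VALUES = {0.0: -1.0, 1.0: 0.5, 2.0: 1.0, 3.0: -0.2}
extension = tietze_partial_sum(F_VALUES, 60)
```

After 60 steps the partial sum agrees with f on F to within about $(2/3)^{60}$, in line with the estimate $|f_n(x)| \le (2/3)^n$ in the proof.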


It is of some interest to observe that this theorem becomes false if 
we omit the assumption that the subspace F is closed. This is easily 
seen by means of the following example. Let X be the closed unit 
interval [0,1], F the subspace (0,1], and f the function defined on F by 
$f(x) = \sin(1/x)$. X is clearly normal, F is not closed as a subspace
of X, and f cannot be extended continuously to X in any manner 
whatsoever. 
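
A numerical experiment makes the obstruction visible, although it is not a substitute for a proof. In the sketch below (the names `peak` and `trough` are ours) we evaluate f along two sequences of points of (0,1] that both tend to 0: on the first f is equal to 1, and on the second it is equal to −1, up to rounding error.

```python
import math

def f(x):
    """The function sin(1/x) on the half-open interval (0, 1]."""
    return math.sin(1.0 / x)

def peak(k):
    """Points tending to 0 at which sin(1/x) equals 1."""
    return 1.0 / (math.pi / 2 + 2 * math.pi * k)

def trough(k):
    """Points tending to 0 at which sin(1/x) equals -1."""
    return 1.0 / (3 * math.pi / 2 + 2 * math.pi * k)
```

Since f takes both of the values 1 and −1 at points arbitrarily close to 0, no single value assigned at 0 can be the limit of f along both sequences.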


Problems 


1. In the text we used Urysohn’s lemma as a tool to prove Tietze’s 
theorem. Reverse this process, and deduce Urysohn’s lemma from 
Tietze’s theorem. 

2. State and prove a generalization of Tietze’s theorem which relates to 
functions whose values lie in $R^n$.

3. Justify the assertion in the last paragraph of the text that the function 
defined there cannot be extended continuously to X. 


29. THE URYSOHN IMBEDDING THEOREM 


In Chap. 3, we generalized metric spaces to topological spaces. 
We now reverse this procedure and seek out simple conditions which 
guarantee that a topological space is essentially a metric space, that is, 
which imply that it is metrizable. Problem 16-12 shows that we must 
look for properties of a topological space X which enable us to construct 
a homeomorphism f of X onto a subspace of some metric space; for the 
metric on this subspace can then be carried back by f to X, and we can 
infer that X is metrizable. The simplest property of this kind is discrete- 
ness; for if X is a discrete space, then its underlying set of points, equipped 
with the metric defined in Example 9-1, is a homeomorphic image of 
X under the identity mapping. We can lift our discussion to a more 
meaningful level by observing that since every metric space is normal, 
normality must be among the properties assumed of X, or it must be 
implied by them.

As motivation for our main theorem, we note that since the metric
space $R^\infty$ is second countable, every subspace of it is also second countable.
It turns out that second countability, in addition to normality, suffices
to guarantee that a topological space is homeomorphic to a subspace
of $R^\infty$. In effect, we imbed such a space homeomorphically in $R^\infty$.


Theorem A (the Urysohn Imbedding Theorem). If X is a second countable
normal space, then there exists a homeomorphism f of X onto a subspace
of $R^\infty$, and X is therefore metrizable.

PROOF. We may assume that X has infinitely many points, for otherwise
it would be finite and discrete, and clearly homeomorphic to
any subspace of $R^\infty$ with the same number of points. Since X is
second countable, it has a countably infinite open base $\{G_1, G_2, G_3, \ldots\}$
each of whose sets is different from the empty set and the full space.
If $G_j$ and $x \in G_j$ are given, then by normality there exists a $G_i$ such that
$x \in G_i \subseteq \overline{G}_i \subseteq G_j$. The set of all ordered pairs $(G_i, G_j)$ such that $\overline{G}_i \subseteq G_j$
is countably infinite, and we can arrange them in a sequence $P_1,
P_2, \ldots, P_n, \ldots$. By Urysohn’s lemma, for each ordered pair
$P_n = (G_i, G_j)$ there exists a continuous real function $f_n: X \to [0,1]$ such
that $f_n(\overline{G}_i) = 0$ and $f_n(G_j') = 1$. For each x in X we define f(x) to be
the sequence given by $f(x) = \{f_1(x), f_2(x)/2, \ldots, f_n(x)/n, \ldots\}$. If
we recall that the infinite series $\sum_{n=1}^{\infty} 1/n^2$ converges, it is easy to see that
f is a one-to-one mapping of X into $R^\infty$. It remains to be proved that
f and $f^{-1}$ are continuous.

To prove that f is continuous, it suffices to show that given $x_0$ in
X and $\epsilon > 0$, there exists a neighborhood H of $x_0$ such that $y \in H \Rightarrow
\|f(y) - f(x_0)\| < \epsilon$. Since an infinite series of functions converges
uniformly if its terms are bounded by the terms of a convergent
infinite series of constants, it is easy to see that there exists a positive
integer $n_0$ such that for every y in X we have

$$\|f(y) - f(x_0)\|^2 = \sum_{n=1}^{\infty} \left|[f_n(y) - f_n(x_0)]/n\right|^2 < \sum_{n=1}^{n_0} \left|[f_n(y) - f_n(x_0)]/n\right|^2 + \epsilon^2/2.$$

By the continuity of the $f_n$'s, for each $n = 1, 2, \ldots, n_0$ there exists a
neighborhood $H_n$ of $x_0$ such that $y \in H_n \Rightarrow |[f_n(y) - f_n(x_0)]/n|^2 < \epsilon^2/2n_0$.
If we define H by $H = \bigcap_{n=1}^{n_0} H_n$, it is clear that H is a neighborhood of $x_0$
such that $y \in H \Rightarrow \|f(y) - f(x_0)\|^2 < \epsilon^2 \Rightarrow \|f(y) - f(x_0)\| < \epsilon$.
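
The integer $n_0$ in this argument can be chosen explicitly. Each $f_n$ takes its values in [0,1], so $|f_n(y) - f_n(x_0)| \le 1$ and the part of the series beyond $n_0$ is at most $\sum_{n > n_0} 1/n^2$, which is less than $1/n_0$ by comparison with the integral of $1/x^2$. Any integer $n_0 > 2/\epsilon^2$ therefore suffices. The short sketch below (function names are ours) records this choice and lets the reader compare the bound with a long partial sum of the tail.

```python
import math

def choose_n0(eps):
    """Smallest integer strictly greater than 2 / eps**2 (up to rounding)."""
    return math.floor(2.0 / eps ** 2) + 1

def tail_bound(n0):
    """Upper bound 1/n0 for the tail sum of 1/n**2 over n > n0."""
    return 1.0 / n0

def partial_tail(n0, terms):
    """Sum of 1/n**2 for n0 < n <= n0 + terms, a lower estimate of the tail."""
    return sum(1.0 / n ** 2 for n in range(n0 + 1, n0 + terms + 1))
```

For instance, with $\epsilon = 1/2$ the choice is $n_0 = 9$, and the tail bound 1/9 is below $\epsilon^2/2 = 1/8$.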

We conclude our proof by showing that $f^{-1}$ is continuous as a mapping
of f(X) onto X. It suffices to show that given $x_0$ in X and a basic
neighborhood $G_j$ of $x_0$, there exists $\epsilon > 0$ such that $\|f(y) - f(x_0)\| <
\epsilon \Rightarrow y \in G_j$. $G_j$ is the second member of some ordered pair $P_{n_0} = (G_i, G_j)$
such that $x_0 \in G_i \subseteq \overline{G}_i \subseteq G_j$. If we choose $\epsilon < 1/2n_0$, then we see
that $\|f(y) - f(x_0)\| < \epsilon \Rightarrow \sum_{n=1}^{\infty} |[f_n(y) - f_n(x_0)]/n|^2 < (1/2n_0)^2 \Rightarrow |f_{n_0}(y)
- f_{n_0}(x_0)| < \tfrac{1}{2}$. Since $x_0$ is in $G_i$, $f_{n_0}(x_0) = 0$, and therefore $|f_{n_0}(y)| < \tfrac{1}{2}$.
Since $f_{n_0}(G_j') = 1$, we see that y is in $G_j$.


This theorem puts us in a position to answer several natural questions 
which arise in connection with the inner portions of Fig. 25. We ask 
the reader to deal with these matters in the following problems. 


Problems 


1. We know that every metric space is normal, and also that a normal 
space, if second countable, is metrizable. Give an example of a 
normal space which is not metrizable (hint: see Problem 22-4). This 
shows that metrizable spaces cannot be characterized among topologi- 
cal spaces by the property of normality. 

2. Among normal spaces, second countability implies metrizability. 
Give an example of a metric space which is not second countable. 
This shows that metrizable spaces cannot be characterized among 
normal spaces by the property of second countability. 

3. Show that a compact Hausdorff space is metrizable $\Leftrightarrow$ it is second
countable.¹


30. THE STONE-ČECH COMPACTIFICATION


In the preceding section we showed that if a normal space is second
countable, then it can be imbedded as a subspace in the familiar metric
space $R^\infty$. We now develop a similar imbedding theorem for completely
regular spaces.

In order to motivate this theorem properly, we remark that if X
is a topological space which occurs as a subspace of a compact Hausdorff
space Y, then since Y is completely regular, X is also completely regular,
and is a dense subspace of the compact Hausdorff space $\bar{X}$. We see in
this way that many completely regular spaces are dense subspaces of
compact Hausdorff spaces. Our purpose in this section is to show that
any completely regular space X can be imbedded as a dense subspace in
a special compact Hausdorff space denoted by $\beta(X)$, and that $\beta(X)$ has
the remarkable property that every bounded continuous real function
defined on X has a unique extension to a bounded continuous real
function defined on $\beta(X)$.


1 The results of this section have characterized metrizable spaces among second 
countable spaces (by normality) and among compact Hausdorff spaces (by second 
countability), but not among topological spaces in general. This more difficult 
problem was solved by Smirnov [38]. For an exposition of his solution, see Kelley 
[25, pp. 126-130].

How truly remarkable this extension property is can be seen by
considering the example given at the end of Sec. 28. Here the completely
regular space X is the interval (0,1]. This space is clearly a dense
subspace of the compact Hausdorff space [0,1]. The function f defined
on (0,1] by $f(x) = \sin(1/x)$ is a bounded continuous real function
defined on X, but it cannot be extended continuously to [0,1]. The
space [0,1], though it is a compact Hausdorff space which contains (0,1] as
a dense subspace, is evidently not the space $\beta(X)$. The latter is much
too complicated for any simple description of it to be possible.

Before we start our discussion, we recall two items from the previous 
sections: 

(1) if X is a completely regular space, then the weak topology
generated by C(X,R) equals the given topology;

(2) the relative topology on a subspace of a product space equals 
the weak topology generated by the restrictions of the projections 
to that subspace. 

These facts (they are Problems 27-4 and 22-2) are the basic principles on 
which the following analysis rests. 

We begin with an arbitrary topological space X and the set C(X,R)
of all bounded continuous real functions defined on X. Let the functions
in C(X,R) be indexed by a set of indices i, so that $C(X,R) = \{f_i\}$.
For each index i, let $I_i$ be the smallest closed interval which contains the
range of the function $f_i$. Each $I_i$ is a compact Hausdorff space, so their
product $P = P_i I_i$ is also a compact Hausdorff space, and every subspace
of P is completely regular. We next define a mapping f of X into this
product space by means of $f(x) = \{f_i(x)\}$, that is, in such a way that
f(x) is that point in the product space P whose ith coordinate is the real
number $f_i(x)$. By Problem 22-3 and the fact that $p_i f = f_i$ for each
projection $p_i$, it is clear that f is a continuous mapping of X into P.
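
The mapping f is easiest to picture when only finitely many functions are used. The sketch below is a toy version under assumptions of our own: X is the interval (0,1], and the index set consists of just two functions, the identity and the function sin(1/x) considered at the end of Sec. 28, rather than all of C(X,R). The names `FAMILY`, `evaluation_map`, and `projection` are hypothetical. A point of the product is stored as a dictionary from indices to coordinates.

```python
import math

# A two-element subfamily of C(X,R) for X = (0,1]; the keys play the role of the indices i.
FAMILY = {
    "identity": lambda x: x,
    "sin_inverse": lambda x: math.sin(1.0 / x),
}

def evaluation_map(x):
    """Send x to the point {f_i(x)} of the product, stored as index -> coordinate."""
    return {i: f_i(x) for i, f_i in FAMILY.items()}

def projection(i, point):
    """The projection p_i of the product onto its i-th coordinate space."""
    return point[i]
```

Here the product is the rectangle $[0,1] \times [-1,1]$, and the relation $p_i f = f_i$ says that reading off the ith coordinate of f(x) returns $f_i(x)$. This toy family does not produce the space constructed in the text, which uses every function in C(X,R).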

We now assume that C(X,R) separates the points of X. This is a
weaker assumption than complete regularity and is exactly the requirement
that f be a one-to-one mapping. At this stage, we use f to replace
f(X) as a set by X; that is, we imbed X in P as a set. X is thus a subset
of P which has two topologies: its own, and the relative topology
which it has as a subspace of P. We observe two features of this situation.
First, since f is continuous, the given topology on X is stronger
than its relative topology. Second, C(X,R) is precisely the set of all
restrictions to X of the projections $p_i$ of P onto its coordinate spaces $I_i$. It is now
clear that if X is completely regular, so that by statement (1) its given
topology equals the weak topology generated by C(X,R), then by statement
(2) its given topology equals its relative topology, and X can be
regarded as a subspace of P.

In accordance with these ideas, we now assume that X is completely
regular, and we fully identify it, both as a set and as a topological space,
with the subspace f(X) of P. It is easy to see that the closure $\bar{X}$ of X
in P is a compact Hausdorff space in which X is imbedded as a dense
subspace. Furthermore, each $f_i$ in C(X,R) (that is, each projection $p_i$
restricted to X) has an extension to a bounded continuous real function
defined on $\bar{X}$; this extension is $p_i$ restricted only to $\bar{X}$, and it is
unique by Problem 26-5. The space $\bar{X}$ is commonly denoted by $\beta(X)$.
We summarize these results in the following theorem.


Theorem A. Let X be an arbitrary completely regular space. Then there
exists a compact Hausdorff space $\beta(X)$ with the following properties: (1)
X is a dense subspace of $\beta(X)$; (2) every bounded continuous real function
defined on X has a unique extension to a bounded continuous real function
defined on $\beta(X)$.


We shall prove in Chap. 14 that the space $\beta(X)$ is essentially unique,
in the sense that any other compact Hausdorff space with properties (1)
and (2) is homeomorphic to $\beta(X)$. It is called the Stone-Čech
compactification of the given completely regular space.

Even before our work in this section, it was clear that every sub- 
space of a product of closed intervals is completely regular. It is worthy 
of special emphasis that the above discussion shows, conversely, that 
every completely regular space is homeomorphic to a subspace of such a 
product. 


Problems 


1. If X is completely regular, show that every bounded continuous complex
function defined on X has a unique extension to a bounded
continuous complex function defined on $\beta(X)$.

2. Every closed subspace of a product of closed intervals is a compact 
Hausdorff space. Show, conversely, that every compact Hausdorff 
space is homeomorphic to a closed subspace of such a product. 

3. Prove the following generalization of the Tietze extension theorem. 
If X is a normal space, F a closed subspace of X, and f a continuous 
mapping of F into a completely regular space Y, then f can be 
extended continuously to a mapping f’ of X into a compact Hausdorff 
space Z which contains Y as a subspace. 


1 For Stone’s own version of these ideas, see his paper [39]. 


CHAPTER SIX 


Connectedness 


From the intuitive point of view, a connected space is a topological 
space which consists of a single piece. This property is perhaps the 
simplest which a topological space may have, and yet it is one of the 
most important for the applications of topology to analysis and geometry. 

On the real line, for instance, intervals are connected subspaces, and 
we shall see that they are the only connected subspaces. Continuous 
real functions are often defined on intervals, and functions of this kind 
have many pleasant properties. For example, such a function assumes as 
a value every number between any two of its values (the Weierstrass 
intermediate value theorem); furthermore, its graph is a connected subspace
of the Euclidean plane. Connectedness is also a basic notion in complex 
analysis, for the regions on which analytic functions are studied are 
generally taken to be connected open subspaces of the complex plane. 

In the portion of topology which deals with continuous curves and 
their properties, connectedness is of great significance, for whatever else a 
continuous curve may be, it is certainly a connected topological space. 
We describe some of the central ideas of this field in Appendix 2. 

Spaces which are not connected are also interesting. One of the 
outstanding characteristics of the Cantor set is the extreme degree in 
which it fails to be connected. Much the same is true of the subspace of 
the real line which consists of all rational numbers. These spaces are so 
badly disconnected that they are almost granular in texture. 

Our purpose in this chapter is to convert these rather vague notions 
into precise mathematical ideas, and also to establish the fundamental 
facts in the theory of connectedness which rests upon them.


31. CONNECTED SPACES


A connected space is a topological space X which cannot be represented
as the union of two disjoint non-empty open sets. If $X = A \cup B$,
where A and B are disjoint and open, then A and B are also closed, so
that X is the union of two disjoint closed sets, and conversely. We see
by this that X is connected $\Leftrightarrow$ it cannot be represented as the union of two
disjoint non-empty closed sets. It is also clear that the connectedness
of X amounts to the condition that $\emptyset$ and X are its only subsets which
are both open and closed. A connected subspace of X is a subspace Y 
which is connected as a topological space in its own right. By the defini- 
tion of the relative topology on Y, this is equivalent to the condition that 
Y is not contained in the union of two open subsets of X whose inter- 
sections with Y are disjoint and non-empty. 

Our space X is said to be disconnected if it is not connected, that is, 
if it can be represented in the form $X = A \cup B$, where A and B are
disjoint, non-empty, and open; and if X is disconnected, a representation 
of it in this form (there may be many) is called a disconnection of X. 

We begin by proving a theorem which supports a considerable part 
of the theory of connectedness. 


Theorem A. A subspace of the real line R is connected $\Leftrightarrow$ it is an interval.
In particular, R is connected. 

PROOF. Let X be a subspace of R. We first prove that if X is connected,
then it is an interval. We do this by assuming that X is not an interval
and by using this assumption to show that X is not connected. To say
that X is not an interval is to say that there exist real numbers x, y, z such
that $x < y < z$, x and z are in X, and y is not in X. It is easy to see from
this that $X = [X \cap (-\infty, y)] \cup [X \cap (y, +\infty)]$ is a disconnection of
X, so X is disconnected.

We complete the proof by showing that if X is an interval, then it is
necessarily connected. Our strategy here is to assume that X is
disconnected and to deduce a contradiction from this assumption. Let
$X = A \cup B$ be a disconnection of X. Since A and B are non-empty, we
can choose a point x in A and a point z in B. A and B are disjoint, so
$x \ne z$, and by altering our notation if necessary, we may assume that
$x < z$. Since X is an interval, $[x,z] \subseteq X$, and each point in [x,z] is in
either A or B. We now define y by $y = \sup([x,z] \cap A)$. It is clear
that $x \le y \le z$, so y is in X. Since A is closed in X, the definition of
y shows that y is in A. From this we conclude that $y < z$. Again by the
definition of y, $y + \epsilon$ is in B for every $\epsilon > 0$ such that $y + \epsilon \le z$, and
since B is closed in X, y is in B. We have proved that y is in both A and
B, which contradicts our assumption that these sets are disjoint.


Our next theorem asserts that the property of connectedness is
preserved by continuous mappings.


Theorem B. Any continuous image of a connected space is connected. 
PROOF. Let $f: X \to Y$ be a continuous mapping of a connected space
X into an arbitrary topological space Y. We must show that f(X) is
connected as a subspace of Y. Assume that f(X) is disconnected. As we
have seen, this means that there exist two open subsets G and H of Y
whose union contains f(X) and whose intersections with f(X) are disjoint
and non-empty. This implies, however, that $X = f^{-1}(G) \cup f^{-1}(H)$ is a
disconnection of X, which contradicts the connectedness of X.


As a direct consequence of the two theorems just proved, we have the 
following generalization of the Weierstrass intermediate value theorem. 


Theorem C. The range of a continuous real function defined on a connected 
space is an interval. 


It is a trivial observation that any two discrete spaces with the same 
number of points are essentially identical; for any one-to-one mapping of 
one onto the other (there is at least one) is a homeomorphism, and we may 
think of them as differing only in the symbols used to designate their 
points. It is in this sense that there is only one discrete space with any 
given number of points. The discrete two-point space, which is obviously 
disconnected, is a useful tool in the theory of connectedness. We denote 
its points by the symbols 0 and 1, and we think of them as real numbers. 


Theorem D. A topological space X is disconnected $\Leftrightarrow$ there exists a
continuous mapping of X onto the discrete two-point space {0,1}.
PROOF. If X is disconnected and $X = A \cup B$ is a disconnection, then we
define a continuous mapping f of X onto {0,1} by the requirement that
f(x) = 0 if x is in A and f(x) = 1 if x is in B. This is a valid definition
by the fact that A and B are disjoint and their union is X. Since A and
B are non-empty and open, f is clearly onto and continuous.

On the other hand, if there exists such a mapping, then X is
disconnected; for if X were connected, Theorem B would imply that {0,1} is
connected, and this would be a contradiction.
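
For a finite space, the construction in the proof of Theorem D can be carried out mechanically. The following Python sketch is an illustration only and is not part of the text: the function names are hypothetical, and we assume that the topology is handed to us as an explicit list of all its open sets.

```python
def find_disconnection(points, open_sets):
    """Return (A, B) with A, B non-empty, disjoint, open and A | B == points, or None."""
    points = frozenset(points)
    opens = {frozenset(s) for s in open_sets}
    for a in opens:
        b = points - a
        # A disconnection needs both pieces non-empty and both pieces open.
        if a and b and b in opens:
            return a, b
    return None


def two_point_map(points, open_sets):
    """Build the mapping of Theorem D: 0 on A and 1 on B, or None if X is connected."""
    split = find_disconnection(points, open_sets)
    if split is None:
        return None
    a, _ = split
    return {x: (0 if x in a else 1) for x in points}


def is_continuous_to_discrete(f, points, open_sets):
    """Check continuity into a discrete target: the preimage of each value must be open."""
    opens = {frozenset(s) for s in open_sets}
    return all(
        frozenset(x for x in points if f[x] == v) in opens for v in set(f.values())
    )
```

On the three-point space with open sets ∅, {1}, {2,3}, {1,2,3} the sketch returns a mapping onto {0,1}; on the space with open sets ∅, {1}, {1,2}, {1,2,3} it returns None, since that space is connected.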


This result is a useful tool for the proof of our next theorem. 


Theorem E. The product of any non-empty class of connected spaces is 
connected. 

PROOF. Let $\{X_i\}$ be a non-empty class of connected spaces, and form
their product $X = P_i X_i$. We assume that X is disconnected, and we
deduce a contradiction from this assumption. By Theorem D, there
exists a continuous mapping f of X onto the discrete two-point space
{0,1}. Let $a = \{a_i\}$ be a fixed point in X, and consider a particular
index $i_1$. We define a mapping $f_{i_1}$ of $X_{i_1}$ into X by means of
$f_{i_1}(x_{i_1}) = \{y_i\}$, where $y_i = a_i$ for $i \ne i_1$ and $y_{i_1} = x_{i_1}$. This is
clearly a continuous mapping, so $f f_{i_1}$ is a continuous mapping of $X_{i_1}$
into {0,1}. Since $X_{i_1}$ is connected, we see by Theorem D that $f f_{i_1}$ is
constant and that


\[ (f f_{i_1})(x_{i_1}) = f(a) \]


for every point $x_{i_1}$ in $X_{i_1}$. This shows that f(x) = f(a) for all x's in X
which equal a in all coordinate spaces except $X_{i_1}$. By repeating this
process with another index $i_2$, etc., we see that f(x) = f(a) for all x's in X
which equal a in all but a finite number of coordinate spaces. The set of
all x's of this kind is a dense subset of X, so by Problem 26-5, f is a
constant mapping. This contradicts the assumption that f maps X onto
{0,1}, and completes the proof.


As an application of this result, we show that all finite-dimensional 
Euclidean and unitary spaces are connected. 


Theorem F. The spaces $R^n$ and $C^n$ are connected.

PROOF. We showed in the proof of Theorem 23-B that $R^n$, as a
topological space, can be regarded as the product of n replicas of the real
line R. We have seen in Theorem A that R is connected, so $R^n$ is
connected by Theorem E. We next prove that $C^n$ and $R^{2n}$ are essentially
the same as topological spaces by exhibiting a homeomorphism f of
$C^n$ onto $R^{2n}$. Let $z = (z_1, z_2, \ldots, z_n)$ be an arbitrary element in $C^n$,
and let each coordinate $z_k$ be written out in the form $z_k = a_k + i b_k$, where
$a_k$ and $b_k$ are its real and imaginary parts. We define f by


\[ f(z) = (a_1, b_1, a_2, b_2, \ldots, a_n, b_n). \]


f is clearly a one-to-one mapping of $C^n$ onto $R^{2n}$, and if we observe that
$\|f(z)\| = \|z\|$, it is easy to see that f is a homeomorphism. The fact that
$R^{2n}$ is connected now shows that $C^n$ is also connected.


The techniques developed in the next section will make it possible 
to give an easy proof of a much more general theorem than this, to the 
effect that any Banach space is connected. 


Problems 


1. Show that a topological space is connected $\Leftrightarrow$ every non-empty proper
subset has a non-empty boundary.

2. Show that a topological space X is connected $\Leftrightarrow$ for every two points
in X there is some connected subspace of X which contains both.

3. Prove that a subspace of a topological space X is disconnected $\Leftrightarrow$ it
can be represented as the union of two non-empty sets each of which
is disjoint from the closure (in X) of the other.

4. Show that the graph of a continuous real function defined on an 
interval is a connected subspace of the Euclidean plane. 

5. Show that if a connected space has a non-constant continuous 
real function defined on it, then its set of points is uncountably 
infinite. 

6. If X is a completely regular space, use Theorem D to prove that X is
connected $\Leftrightarrow$ $\beta(X)$ is connected.


32. THE COMPONENTS OF A SPACE 


If a space is not itself connected, then the next best thing is to be 
able to decompose it into a disjoint class of maximal connected subspaces. 
Our present objective is to show that this can always be done. 

A maximal connected subspace of a topological space, that is, a 
connected subspace which is not properly contained in any larger con- 
nected subspace, is called a component of the space. A connected space 
clearly has only one component, namely, the space itself. In a discrete 
space, it is easy to see that each point is a component. 

The following two theorems will be useful in obtaining the desired 
decomposition for a general space. 


Theorem A. Let X be a topological space. If $\{A_i\}$ is a non-empty class
of connected subspaces of X such that $\bigcap_i A_i$ is non-empty, then
$A = \bigcup_i A_i$ is also a connected subspace of X.

PROOF. Assume that A is disconnected. This means that there exist two
open subsets G and H of X whose union contains A and whose intersections
with A are disjoint and non-empty. All the $A_i$'s are connected,
and each lies in $G \cup H$, so each $A_i$ lies entirely in G or entirely in H and is
disjoint from the other. Since $\bigcap_i A_i$ is non-empty, either all the $A_i$'s lie
in G and are disjoint from H, or all lie in H and are disjoint from G. We
see by this that A itself is disjoint from either G or H, and this contradiction
shows that our assumption that A is disconnected is untenable.


Theorem B. Let X be a topological space and A a connected subspace of X.
If B is a subspace of X such that $A \subseteq B \subseteq \overline{A}$, then B is connected; in
particular, $\overline{A}$ is connected.

PROOF. Assume that B is disconnected, that is, that there exist two open
subsets G and H of X whose union contains B and whose intersections
with B are disjoint and non-empty. Since A is connected and contained
in $G \cup H$, A is contained in either G or H and is disjoint from the other.
Let us say, just to be specific, that A is disjoint from H. This implies
that $\overline{A}$ is also disjoint from H, and since $B \subseteq \overline{A}$, B is disjoint from H.
This contradiction shows that B cannot be disconnected, and proves our
theorem.


We are now in a position to state and prove the main facts about 
components. 


Theorem C. If X is an arbitrary topological space, then we have the fol- 
lowing: (1) each point in X is contained in exactly one component of X; (2) 
each connected subspace of X is contained in a component of X; (3) a 
connected subspace of X which is both open and closed is a component of X; 
and (4) each component of X is closed. 

PROOF. To prove (1), let x be a point in X. Consider the class $\{C_i\}$ of
all connected subspaces of X which contain x. This class is non-empty,
since $\{x\}$ itself is connected. By Theorem A, $C = \bigcup_i C_i$ is a connected
subspace of X which contains x. C is clearly maximal, and therefore a
component of X, because any connected subspace of X which contains
C is one of the $C_i$'s and is thus contained in C. Finally, C is the only
component of X which contains x. For if $C^*$ is another, it is clearly
among the $C_i$'s and is therefore contained in C, and since $C^*$ is maximal as
a connected subspace of X, we must have $C^* = C$.

Part (2) is a direct consequence of the construction in the above 
paragraph, for by this construction, a connected subspace of X is con- 
tained in the component which contains any one of its points. 

We prove (3) as follows. Let A be a connected subspace of X which 
is both open and closed. By (2), A is contained in some component C. 
If A is a proper subset of C, then it is easy to see that


\[ C = (C \cap A) \cup (C \cap A') \]


is a disconnection of C. This contradicts the fact that C, being a com- 
ponent, is connected, and we conclude that A = C. 

Part (4) follows immediately from Theorem B; for if a component
C is not closed, then its closure $\overline{C}$ is a connected subspace of X which
properly contains C, and this contradicts the maximality of C as a
connected subspace of X.
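
The construction used in part (1), in which the component of x is obtained as the union of all connected subspaces containing x, can be imitated literally on a small finite space. The Python sketch below is an illustration under our own assumptions: the names are hypothetical, the topology is given as a list of all open sets, and the search runs over every subset, so it is practical only for a handful of points.

```python
from itertools import combinations


def is_connected(subset, open_sets):
    """A subspace Y is connected if no relatively open A has Y - A non-empty and relatively open."""
    y = frozenset(subset)
    relative = {y & frozenset(g) for g in open_sets}  # the subspace topology on Y
    return not any(a and (y - a) and (y - a) in relative for a in relative)


def component_of(x, points, open_sets):
    """Union of all connected subspaces containing x, as in the proof of part (1)."""
    others = [p for p in points if p != x]
    union = {x}
    for r in range(len(others) + 1):
        for extra in combinations(others, r):
            candidate = {x, *extra}
            if is_connected(candidate, open_sets):
                union |= candidate
    return frozenset(union)


def components(points, open_sets):
    """The distinct components of the space; by part (1) they partition it."""
    return {component_of(x, points, open_sets) for x in points}
```

For the four-point space with open sets ∅, {1}, {1,2}, {3,4}, {1,3,4}, {1,2,3,4}, the sketch yields the two components {1,2} and {3,4}, each of which is closed, in agreement with part (4).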


In view of parts (3) and (4) of this theorem, it is natural to ask if a 
component of a space is necessarily open. The answer is no, as the 
following example shows. Let X be the subspace of the real line which 
consists of all rational numbers. We observe two facts about X. First, 
if x and z are any two distinct rationals, and if x < z, then there exists 
an irrational y such that x < y < z, and therefore 


\[ X = [X \cap (-\infty, y)] \cup [X \cap (y, +\infty)] \]
is a disconnection of X which separates x and z. It is easy to see from
this that any subspace of X with more than one point is disconnected, so 
the components of X are its points. Second, the points of X are not 
open, for any open subset of R which contains a given rational number 
also contains others different from it. Here, then, is a space whose 
components are its points and whose points are not open. This example 
also shows that a space need not be discrete in order that each of its 
points be a component. 


Problems 


1. If $A_1, A_2, \ldots, A_n, \ldots$ is a sequence of connected subspaces of a
topological space each of which intersects its successor, show that
$\bigcup_{n=1}^{\infty} A_n$ is connected.

2. Show that the union of any non-empty class of connected subspaces 
of a topological space each pair of which intersects is connected. 

3. In Theorem 31-E we proved that a product space is connected if its 
coordinate spaces are. Devise a different proof of this fact, based 
on Theorem A, for the case in which there are only two coordinate 
spaces. 

4. Use Theorem A to prove that an arbitrary Banach space B is
connected. (Hint: if x is a vector, show that the set of all scalar
multiples of x is a connected subspace of B.)

5. Let B be an arbitrary Banach space. A convex set in B is a non-empty
subset S with the property that if x and y are in S, then


\[ z = x + t(y - x) = (1 - t)x + ty \]


is also in S for every real number t such that $0 \le t \le 1$. Intuitively, a
convex set is a non-empty set which contains the segment joining any 
pair of its points. Prove that every convex subspace of B is con- 
nected. Prove also that every sphere (open or closed) in B is 
convex, and is therefore connected. 

6. Show that an open subspace of the complex plane is connected $\Leftrightarrow$
every two points in it can be joined by a polygonal line.

7. Consider the union of two open discs in the complex plane which are 
externally tangent to each other. State whether this subspace of the 
plane is connected or disconnected, and justify your answer. Do 
the same when one disc is open and the other closed, and when both 
are closed. 

8. Consider the following subspace of the Euclidean plane:
$\{(x,y): x \ne 0 \text{ and } y = \sin(1/x)\}$. Is this connected or disconnected? Why?
Answer the same questions for the subspace
$\{(x,y): x \ne 0 \text{ and } y = \sin(1/x)\} \cup \{(x,y): x = 0 \text{ and } -1 \le y \le 1\}$.


33. TOTALLY DISCONNECTED SPACES


We have seen that a connected space is one for which no disconnec- 
tion is possible. We now consider spaces which have a great many 
disconnections, and which therefore lie, in a manner of speaking, at the 
opposite end of the connectivity spectrum. 

A totally disconnected space is a topological space X in which every 
pair of distinct points can be separated by a disconnection of X. This 
means that for every pair of points x and y in X such that $x \ne y$, there
exists a disconnection $X = A \cup B$ with $x \in A$ and $y \in B$. Such a space
is evidently a Hausdorff space, and if it has more than one point, it is 
disconnected. Oddly enough, a one-point space is both connected and 
totally disconnected. 

The discrete spaces are the simplest totally disconnected spaces. 
A more interesting example is the space discussed at the end of the 
previous section, that is, the set of all rational numbers considered as a 
subspace of the real line. The set of all irrational numbers is also a 
totally disconnected subspace of the real line, and this is proved in much 
the same way, from the fact that there exists a rational number between 
any two irrationals. The Cantor set is yet another totally disconnected 
subspace of the real line, this time one which is compact. 

Our first theorem should not come as a surprise to anyone. 


Theorem A. The components of a totally disconnected space are its points. 
PROOF. If X is a totally disconnected space, it suffices to show that
every subspace Y of X which contains more than one point is
disconnected. Let x and y be distinct points in Y, and let $X = A \cup B$ be a
disconnection of X with $x \in A$ and $y \in B$. It is obvious that


\[ Y = (Y \cap A) \cup (Y \cap B) \]
is a disconnection of Y.


Total disconnectedness is closely related to another interesting 
property. 
Theorem B. Let X be a Hausdorff space. If X has an open base whose 
sets are also closed, then X is totally disconnected. 
PROOF. Let x and y be distinct points in X. Since X is Hausdorff,
x has a neighborhood G which does not contain y. By our assumption,
there exists a basic open set B which is also closed such that $x \in B \subseteq G$.
$X = B \cup B'$ is clearly a disconnection of X which separates x and y.


If the space X in this theorem is also compact, then the implication
can be reversed, and the two conditions are equivalent.


Theorem C. Let X be a compact Hausdorff space. Then X is totally
disconnected $\Leftrightarrow$ it has an open base whose sets are also closed.

PROOF. In view of Theorem B, it suffices to assume that X is totally
disconnected and to prove that the class of all subsets of X which are both
open and closed forms an open base. Let x be a point and G an open set
which contains it. We must produce a set B which is both open and
closed such that $x \in B \subseteq G$. We may assume that G is not the full
space, for if G = X, then we can satisfy our requirement by taking
B = X. G' is thus a closed subspace of X, and since X is compact, G' is
also compact. By the assumption that X is totally disconnected, for
each point y in G' there exists a set $H_y$ which is both open and closed and
contains y but not x. G' is compact, so there exists some finite class of
$H_y$'s, which we denote by $\{H_1, H_2, \ldots, H_n\}$, with the property that its
union contains G' but not x. We define H by $H = \bigcup_{i=1}^{n} H_i$, and we
observe that since this is a finite union and all the $H_i$'s are closed as well
as open, H is both open and closed, it contains G', and it does not contain
x. If we now define B to be H', then B clearly has the properties required
of it.


Totally disconnected spaces are of considerable significance in
several parts of topology, notably in dimension theory (see Hurewicz and
Wallman [21]) and in the classic representation theory for Boolean
algebras given in Appendix 3.¹

¹ The reader should be made aware of the fact that several different definitions
of total disconnectedness are commonly found in the literature. The definition
given above seems to the present writer to have the logic of language behind it; and
Theorem C shows that this definition is equivalent (in the case of a compact Hausdorff
space) to the most important of these alternative definitions.


Problems


1. Prove that the product of any non-empty class of totally
disconnected spaces is totally disconnected.

2. Prove that a totally disconnected compact Hausdorff space is
homeomorphic to a closed subspace of a product of discrete two-point
spaces.


34. LOCALLY CONNECTED SPACES


In Sec. 23 we encountered the concept of a locally compact space,
that is, of a space which is compact around each point but need not be
compact as a whole. We now study another “local” property which a
topological space may have, that of being connected in the vicinity of
each of its points.

A locally connected space is a topological space with the property that
if x is any point in it and G any neighborhood of x, then G contains a
connected neighborhood of x. This is evidently equivalent to the condition
that each point of the space have an open base whose sets are all
connected subspaces. Locally connected spaces are quite abundant, for,
as we have seen in Problem 32-5, every Banach space is locally connected.

[Fig. 26. $A \cup B$ is connected but not locally connected.]

We know that local compactness is implied by compactness. Local 
connectedness, however, neither implies, nor is implied by, connectedness. 
The union of two disjoint open intervals on the real line is a simple 
example of a space which is locally connected but not connected. A space 
can also be connected without being locally connected, as the following 
example shows. Let X be the subspace of the Euclidean plane defined 
by $X = A \cup B$, where $A = \{(x,y): x = 0 \text{ and } -1 \le y \le 1\}$ and
$B = \{(x,y): 0 < x \le 1 \text{ and } y = \sin(1/x)\}$ (see Fig. 26). B is the image
of the interval (0,1] under the continuous mapping f defined by


\[ f(x) = (x, \sin(1/x)), \]


so B is connected by Theorem 31-B; and since $X = \overline{B}$, X is connected by
Theorem 32-B. X is not locally connected, however, for it is reasonably
easy to see that each point x in A has a neighborhood which does not
contain any connected neighborhood of x.

We know by Theorem 32-C that the components of an arbitrary 
topological space X are always closed sets, and from this we see at once 
that the components of any closed subspace of X are also closed in X. 
The reader may feel, with some justification, that the components of a 
well-behaved space ought to be open sets. This is true for locally 
connected spaces. 


Theorem A. Let X be a locally connected space. If Y is an open subspace
of X, then each component of Y is open in X. In particular, each component
of X is open.

PROOF. Let C be a component of Y. We wish to show that C is open
in X. Let x be a point in C. Since X is locally connected and Y is
open in X, Y contains a connected neighborhood G of x. It suffices to
show that $G \subseteq C$. This will follow at once from the fact that C is a
component of Y if we can show that G is connected as a subspace of Y.
But this is clear by Problem 16-6, according to which the topology of 
G as a subspace of Y is the same as its topology as a subspace of X; for 
G is connected with respect to the latter topology. 


The principal applications of local connectedness lie in the theory 
of continuous curves (see Appendix 2). 


Problems 


1. Prove that a topological space X is locally connected if the compo- 
nents of every open subspace of X are open in X. 

2. A connected subspace of a locally connected space X is locally 
connected if X is the real line. Why? Is this true if X is an arbi- 
trary locally connected space? 

3. Show that a compact locally connected space has a finite number of 
components. 

4. Show that the image of a locally connected space under a mapping 
which is both continuous and open is locally connected. 

5. Prove that the product of any non-empty finite class of locally 
connected spaces is locally connected. 

6. Show that the product of an arbitrary non-empty class of locally 
connected spaces can fail to be locally connected. (Hint: consider a 
product of discrete two-point spaces.) 

7. Prove that the product of any non-empty class of connected locally 
connected spaces is locally connected. 


CHAPTER SEVEN 


Approximation 


Our work in the present chapter centers around the famous theorem 
of Weierstrass on the approximation by polynomials of continuous real 
functions defined on closed intervals. This theorem, important as it is 
in classical analysis, has been overshadowed in recent years by a gen- 
eralized form of it discovered by Stone. The latter relates to continuous 
functions defined on compact Hausdorff spaces, and has become an 
indispensable tool in topology and modern analysis. 

We prove the Weierstrass theorem and then the two forms of the 
Stone-Weierstrass theorem which deal separately with real and complex 
functions. Finally, after an excursion into the theory of locally compact 
Hausdorff spaces, we extend the Stone-Weierstrass theorems to this 
context. 


35. THE WEIERSTRASS APPROXIMATION THEOREM 


Let us consider a closed interval [a,b] on the real line and a
polynomial


\[ p(x) = a_0 + a_1 x + \cdots + a_n x^n, \]


with real coefficients, defined on [a,b].¹ Every such polynomial is
obviously a continuous real function, and as a consequence of the second
lemma in Sec. 20, we know that the limit of any uniformly convergent
sequence of such polynomials is also a continuous real function. The
Weierstrass theorem states that the converse of this is also true, that is,
that any continuous real function defined on [a,b] is the limit of some
uniformly convergent sequence of polynomials. This is clearly equivalent
to the statement that such a function can be uniformly approximated
by polynomials to within any given degree of accuracy. Many
proofs of this classic theorem are known, and the one we give is perhaps
as concise and elementary as most.

¹ This polynomial can of course be regarded as a function defined on the entire
real line. We ignore this fact and consider only x's which lie in [a,b].


Theorem A (the Weierstrass Approximation Theorem). Let f be a
continuous real function defined on a closed interval [a,b], and let $\varepsilon > 0$ be
given. Then there exists a polynomial p with real coefficients such that
$|f(x) - p(x)| < \varepsilon$ for all x in [a,b].
PROOF. As a first step, we show that it suffices to prove the theorem for
the special case in which a = 0 and b = 1. If a = b, the conclusion
follows at once on taking p to be the constant polynomial defined by
p(x) = f(a). We may thus assume that a < b. We next observe that
$x = [b - a]x' + a$ gives a continuous mapping of [0,1] onto [a,b], so that
the function g defined by $g(x') = f([b - a]x' + a)$ is a continuous real
function defined on [0,1]. If our theorem is proved for the case in which
a = 0 and b = 1, then there exists a polynomial p' defined on [0,1] such
that $|g(x') - p'(x')| < \varepsilon$ for all x' in [0,1]. If we now express this
inequality in terms of x, we obtain $|f(x) - p'([x - a]/[b - a])| < \varepsilon$ for
all x in [a,b]; and defining a polynomial p by $p(x) = p'([x - a]/[b - a])$
yields our theorem in the general case. Accordingly, we may assume
that a = 0 and b = 1.

We next recall that if n is a positive integer and k an integer such
that $0 \le k \le n$, then the binomial coefficient $\binom{n}{k}$ is defined by


\[ \binom{n}{k} = \frac{n!}{k!\,(n - k)!}. \]


The polynomials $B_n$, one for each n, defined by


\[ B_n(x) = \sum_{k=0}^{n} \binom{n}{k} x^k (1 - x)^{n-k} f\!\left(\frac{k}{n}\right) \]


are called the Bernstein polynomials associated with f. We prove our
theorem by finding a Bernstein polynomial with the required property.
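
To make the definition concrete, the Bernstein polynomial associated with f can be evaluated directly from the sum over k = 0, 1, ..., n. The Python sketch below is an illustration only; the names `bernstein` and `sup_error_on_grid` are ours, and the error is measured on a finite grid of sample points rather than on all of [0,1].

```python
from math import comb


def bernstein(f, n, x):
    """Evaluate B_n(x) = sum over k of C(n,k) x^k (1-x)^(n-k) f(k/n), for x in [0, 1]."""
    return sum(
        comb(n, k) * x**k * (1 - x) ** (n - k) * f(k / n) for k in range(n + 1)
    )


def sup_error_on_grid(f, n, points=201):
    """Largest value of |f(x) - B_n(x)| over equally spaced sample points in [0, 1]."""
    grid = [i / (points - 1) for i in range(points)]
    return max(abs(f(x) - bernstein(f, n, x)) for x in grid)
```

For instance, `sup_error_on_grid(lambda x: abs(x - 0.5), n)` is smaller for n = 100 than for n = 10. The direct formula is meant for moderate n; once n is much beyond a thousand the integer binomial coefficients become too large to be converted to floating point, and a different evaluation scheme would be needed.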

Several identities will be needed for this. The first is a special case
of the binomial theorem:


\[ \sum_{k=0}^{n} \binom{n}{k} x^k (1 - x)^{n-k} = [x + (1 - x)]^n = 1. \tag{1} \]

If we differentiate (1) with respect to x, we get


\[ \sum_{k=0}^{n} \binom{n}{k} \left[ k x^{k-1} (1 - x)^{n-k} - (n - k) x^k (1 - x)^{n-k-1} \right]
   = \sum_{k=0}^{n} \binom{n}{k} x^{k-1} (1 - x)^{n-k-1} (k - nx) = 0, \]

and multiplying through by x(1 - x) gives

\[ \sum_{k=0}^{n} \binom{n}{k} x^k (1 - x)^{n-k} (k - nx) = 0. \tag{2} \]


On differentiating (2) with respect to x and considering $x^k (1 - x)^{n-k}$ as
one of the two factors in applying the product rule, we get


\[ \sum_{k=0}^{n} \binom{n}{k} \left[ -n x^k (1 - x)^{n-k} + x^{k-1} (1 - x)^{n-k-1} (k - nx)^2 \right] = 0. \tag{3} \]


Applying (1) to (3) gives


\[ \sum_{k=0}^{n} \binom{n}{k} x^{k-1} (1 - x)^{n-k-1} (k - nx)^2 = n; \]


and on multiplying this through by x(1 - x), we find that


\[ \sum_{k=0}^{n} \binom{n}{k} x^k (1 - x)^{n-k} (k - nx)^2 = n x (1 - x), \]


or, on dividing both sides by $n^2$,

\[ \sum_{k=0}^{n} \binom{n}{k} x^k (1 - x)^{n-k} \left( x - \frac{k}{n} \right)^2 = \frac{x(1 - x)}{n}. \tag{4} \]


Identities (1) and (4) will be our main tools in showing that $B_n(x)$ is
uniformly close to f(x) for all sufficiently large n.
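
Identities (1) and (4) are easy to confirm for particular values of n and x. The sketch below does so in exact rational arithmetic; it is only a numerical check under our own naming (`weight`, `identity_1_lhs` and `identity_4_lhs` are hypothetical helpers) and is no substitute for the derivation.

```python
from fractions import Fraction
from math import comb


def weight(n, k, x):
    """The k-th term C(n,k) x^k (1-x)^(n-k) appearing in identities (1) and (4)."""
    return comb(n, k) * x**k * (1 - x) ** (n - k)


def identity_1_lhs(n, x):
    """Left side of (1); it should equal 1 for every x."""
    return sum(weight(n, k, x) for k in range(n + 1))


def identity_4_lhs(n, x):
    """Left side of (4); it should equal x(1-x)/n for every x."""
    return sum(weight(n, k, x) * (x - Fraction(k, n)) ** 2 for k in range(n + 1))


# Check both identities exactly at a few rational points of [0, 1].
for n in range(1, 9):
    for x in (Fraction(0), Fraction(1, 3), Fraction(1, 2), Fraction(7, 10), Fraction(1)):
        assert identity_1_lhs(n, x) == 1
        assert identity_4_lhs(n, x) == x * (1 - x) / n
```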
Now for the proof of the fact just stated. By using (1), we see that


\[ f(x) - B_n(x) = \sum_{k=0}^{n} \binom{n}{k} x^k (1 - x)^{n-k} \left[ f(x) - f\!\left(\frac{k}{n}\right) \right], \]


so that


\[ |f(x) - B_n(x)| \le \sum_{k=0}^{n} \binom{n}{k} x^k (1 - x)^{n-k} \left| f(x) - f\!\left(\frac{k}{n}\right) \right|. \tag{5} \]


Since f is uniformly continuous on [0,1], we can find a $\delta > 0$ such that
$|x - k/n| < \delta \Rightarrow |f(x) - f(k/n)| < \varepsilon/2$. We now split the sum on the
right of (5) into two parts, denoted by $\Sigma$ and $\Sigma'$, where $\Sigma$ is the sum of
those terms for which $|x - k/n| < \delta$ (we think of x as fixed but
arbitrary) and where $\Sigma'$ is the sum of the remaining terms. It is easy to see
that $\Sigma < \varepsilon/2$. We complete the proof by showing that if n is taken
sufficiently large, then $\Sigma'$ can be made less than $\varepsilon/2$ independently of x.
Since f is bounded, there exists a positive real number K such that
$|f(x)| \le K$ for all x in [0,1]. From this it follows that


\[ \Sigma' \le 2K \sum \binom{n}{k} x^k (1 - x)^{n-k}, \]


where the sum on the right (denote it by $\Sigma''$) is taken over all k such
that $|x - k/n| \ge \delta$. It now suffices to show that if n is taken sufficiently
large, then $\Sigma''$ can be made less than $\varepsilon/4K$ independently of x. Identity
(4) shows that


\[ \delta^2\, \Sigma'' \le \frac{x(1 - x)}{n}, \]


so

\[ \Sigma'' \le \frac{x(1 - x)}{\delta^2 n}. \]

The maximal value of x(1 - x) on [0,1] is 1/4, so

\[ \Sigma'' \le \frac{1}{4 \delta^2 n}. \]

If we take n to be any integer greater than $K/\delta^2\varepsilon$, then $\Sigma'' < \varepsilon/4K$,
$\Sigma' < \varepsilon/2$, $|f(x) - B_n(x)| < \varepsilon$ for all x in [0,1], and our theorem is fully
proved.
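
The last step of the proof gives an explicit, if generous, degree: any integer n greater than K/δ²ε will do. The Python sketch below works this out for one example. It is an illustration under stated assumptions: the function f(x) = |x - 1/2| is our choice, for which K = 1/2 and δ = ε/2 are admissible because f has Lipschitz constant 1, the helper names are hypothetical, and the error is checked only on a finite grid.

```python
from math import comb, floor


def bernstein(f, n, x):
    """Evaluate the Bernstein polynomial B_n(x) associated with f, for x in [0, 1]."""
    return sum(
        comb(n, k) * x**k * (1 - x) ** (n - k) * f(k / n) for k in range(n + 1)
    )


def degree_from_proof(K, delta, eps):
    """Smallest integer n with n > K / (delta**2 * eps), as in the final step of the proof."""
    return floor(K / (delta**2 * eps)) + 1


def f(x):
    return abs(x - 0.5)  # |f| <= 1/2 on [0, 1]; |f(x) - f(y)| <= |x - y|


eps = 0.25
delta = eps / 2  # |x - y| < delta implies |f(x) - f(y)| < eps / 2
K = 0.5
n = degree_from_proof(K, delta, eps)

# Sample the error on a grid; the proof guarantees it stays below eps everywhere.
grid = [i / 1000 for i in range(1001)]
max_err = max(abs(f(x) - bernstein(f, n, x)) for x in grid)
assert max_err < eps
```

The degree produced in this way is far from optimal; the estimate in the proof is designed to be uniform in x, not sharp.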


The Weierstrass theorem clearly amounts to the assertion that for 
any closed interval [a,b] on the real line, the polynomials are dense in the 
metric space C[a,b]. This is the form of the theorem which we shall 
generalize in the next section to C(X,R), where X is an arbitrary compact 
Hausdorff space. 

The slightly restricted statement that the polynomials are dense in
C[0,1] has another generalization, in a different direction. This result
is so remarkable that we state it because of its intrinsic interest, though
we give no proof. The Weierstrass theorem for C[0,1] says, in effect,
that all real linear combinations of the functions


\[ 1, x, x^2, \ldots, x^n, \ldots \]


are dense in C[0,1], where by a real linear combination of these functions
we mean the result of choosing any finite set of them, multiplying each by
a real number, and adding. Instead of working with all positive powers
of x, let us permit gaps to occur, and consider the infinite set of functions


\[ 1, x^{n_1}, x^{n_2}, \ldots, x^{n_k}, \ldots, \]

the $n_k$'s being positive integers for which $n_1 < n_2 < \cdots < n_k < \cdots$.
The result we have in mind is called Müntz's theorem, and asserts that all
real linear combinations of these functions are dense in C[0,1] $\Leftrightarrow$ the
series $\sum_{k=1}^{\infty} 1/n_k$ diverges. For a proof, we refer the interested reader to
Lorentz [29, pp. 46-48] or Achieser [1, pp. 43-46].


Problems 


1. Prove that C[a,b] is separable. 
2. Let f be a continuous real function defined on [0,1]. The moments
of f are the numbers $\int_0^1 f(x)\,x^n\,dx$, where n = 0, 1, 2, .... Prove
that two continuous real functions defined on [0,1] are identical if
they have the same sequence of moments.

3. Use the Weierstrass theorem to prove that the polynomials are
dense in C(X,R) for any closed and bounded subspace X of the real
line.


36. THE STONE-WEIERSTRASS THEOREMS 


Our purpose in this section is to lay bare the true nature of the 
Weierstrass approximation theorem. We achieve this end by general- 
izing the theorem in such a manner that its inessential features are 
stripped away. 

Our starting point is the fact that the polynomials are dense in
C[a,b] for any closed interval [a,b]. We wish to replace [a,b] by an
arbitrary compact Hausdorff space X and to make a similar statement
about C(X,R). The most obvious difficulty in this program is that it is
meaningless to speak of polynomials on X. This obstacle will disappear
when we take a closer look at what polynomials are.

Consider the two functions 1 and x defined on [a,b]. The set P of
all polynomials on [a,b] is identical with the set of all functions which
can be built from these two by applying the following three operations:
multiplication, multiplication by real numbers, and addition. P is an
algebra of real functions on [a,b], for it is closed with respect to these
three operations. Even more, it is a subalgebra of C[a,b]. We say that
P is the subalgebra of C[a,b] generated by {1,x}, for it is a subalgebra
containing {1,x} which is contained in every subalgebra with this
property. We know by Problem 20-3 that the closure of a subalgebra of
C(X,R) (for any topological space X) is also a subalgebra of C(X,R).
We may therefore speak of the closure $\overline{P}$ of P as the closed subalgebra of
C[a,b] generated by {1,x}. As above, this means that $\overline{P}$ is a closed
subalgebra containing {1,x} which is contained in every closed subalgebra
with this property. These ideas make it possible for us to state the
Weierstrass theorem in the following equivalent forms:
(1) the closed subalgebra of C[a,b] generated by {1,x} equals
C[a,b];
(2) any closed subalgebra of C[a,b] which contains {1,x} equals
C[a,b].
These are potent statements, saying, as they do, that the very small set of
functions {1,x} suffices to generate the much more extensive set C[a,b].
As our theorems below will show, these statements depend only on the
fact that a closed subalgebra of C[a,b] which contains the set {1,x}
separates points in the sense of Sec. 27 (for it contains the function x)
and contains all constant functions (for it contains the non-zero constant
function 1).

Before we go further, it is worth observing that statement (1) is not
true in general if either 1 or x is omitted from the generating set. If x is
omitted, then the closed subalgebra generated by {1} consists of the
constant functions, and this is not equal to C[a,b] unless a = b. On the
other hand, if 1 is omitted, then the closed subalgebra generated by {x}
contains only functions which vanish at 0, and if 0 is in [a,b], then the
non-zero constant functions, among others, are not in this closed subalgebra.

Our theorems rest on two lemmas, both of which have to do with the 
fact that C(X,R) is a lattice for any topological space X. If f and g are
functions in C(X,R), we recall that their join and meet are defined by


\[ (f \vee g)(x) = \max\{f(x), g(x)\} \]
and
\[ (f \wedge g)(x) = \min\{f(x), g(x)\}. \]


Our first lemma states conditions which guarantee that a closed sub- 
lattice of C(X,R) equals C(X,R). 


Lemma. Let X be a compact Hausdorff space with more than one point,
and let L be a closed sublattice of C(X,R) with the following property: if
x and y are distinct points of X and a and b any two real numbers, then there
exists a function f in L such that f(x) = a and f(y) = b. Then L equals
C(X,R).¹

PROOF. Let f be an arbitrary function in C(X,R). We must show that
f is in L. Let $\varepsilon > 0$ be given. Since L is closed, it suffices to construct a
function g in L such that $f(z) - \varepsilon < g(z) < f(z) + \varepsilon$ for all points z in
X, for it will follow from this that $\|f - g\| < \varepsilon$. We now construct such
a function.

Let x be a point in X which is fixed throughout this paragraph, and
let y be a point different from x. By our assumption about L, there
exists a function $f_y$ in L such that $f_y(x) = f(x)$ and $f_y(y) = f(y)$. Now
consider the open set $G_y = \{z: f_y(z) < f(z) + \varepsilon\}$. It is clear that both
x and y belong to $G_y$, so the class of $G_y$'s for all points y different from
x is an open cover of X. Since X is compact, this open cover has a
finite subcover, which we denote by $\{G_1, G_2, \ldots, G_n\}$. If the
corresponding functions in L are denoted by $f_1, f_2, \ldots, f_n$, then


\[ g_x = f_1 \wedge f_2 \wedge \cdots \wedge f_n \]


is evidently a function in L such that $g_x(x) = f(x)$ and $g_x(z) < f(z) + \varepsilon$
for all points z in X.

We next consider the open set $H_x = \{z: g_x(z) > f(z) - \varepsilon\}$. Since
x belongs to $H_x$, the class of $H_x$'s for all points x in X is an open cover of
X. The compactness of X implies that this open cover has a finite
subcover, which we denote by $\{H_1, H_2, \ldots, H_m\}$. We denote the
corresponding functions in L by $g_1, g_2, \ldots, g_m$, and we define g by
$g = g_1 \vee g_2 \vee \cdots \vee g_m$. It is clear that g is a function in L with the
property that $f(z) - \varepsilon < g(z) < f(z) + \varepsilon$ for all points z in X, so our
proof is complete.

¹ If X has only one point, then a single constant function constitutes a closed
sublattice of C(X,R) with the stated property which does not equal C(X,R). It is
therefore necessary to assume that X has more than one point. Further, the reader
will notice that the proof given below makes no use of the assumption that X is
Hausdorff. However, if there exists a closed sublattice of C(X,R) with the stated
property, then X is necessarily Hausdorff, so there is nothing to be gained by omitting
this assumption.


In our next lemma we make use of the concept of the absolute value
of a function. If f is a real or complex function defined on a topological
space X, then the function |f|, called the absolute value of f, is defined by
|f|(x) = |f(x)|. If f is continuous, then |f| is also continuous. We
observe that the lattice operations in C(X,R) are expressible in terms of
addition, scalar multiplication, and the formation of absolute values:

\[ f \vee g = \frac{f + g + |f - g|}{2} \quad\text{and}\quad f \wedge g = \frac{f + g - |f - g|}{2}. \]

These identities show that any linear subspace of C(X,R) which contains
the absolute value of each of its functions is a sublattice of C(X,R).
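
The way the lattice operations reduce to absolute values can be checked numerically. The sketch below assumes the standard identities max{s,t} = (s + t + |s − t|)/2 and min{s,t} = (s + t − |s − t|)/2; the sample functions f and g and the grid of test points are arbitrary choices made only for illustration.

```python
def join(f, g):
    """Pointwise maximum of f and g, built only from +, scalars, and |.|."""
    return lambda x: (f(x) + g(x) + abs(f(x) - g(x))) / 2


def meet(f, g):
    """Pointwise minimum of f and g, built only from +, scalars, and |.|."""
    return lambda x: (f(x) + g(x) - abs(f(x) - g(x))) / 2


# Illustrative functions (assumptions, not taken from the text).
f = lambda x: x * x
g = lambda x: 1 - x

# A dyadic grid keeps every intermediate value exactly representable.
points = [k / 8 for k in range(-16, 17)]

join_ok = all(join(f, g)(x) == max(f(x), g(x)) for x in points)
meet_ok = all(meet(f, g)(x) == min(f(x), g(x)) for x in points)
```

Both flags come out true on this grid, which is consistent with the claim that a linear subspace closed under absolute values is closed under the two lattice operations.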


Lemma. Let X be an arbitrary topological space. Then every closed sub- 
algebra of C(X,R) is also a closed sublattice of C(X,R). 

PROOF. Let A be a closed subalgebra of C(X,R). By the above remarks,
it suffices to show that if f is in A, then |f| is also in A. Let ε > 0 be
given. Since |t| is a continuous function of the real variable t, by the
Weierstrass approximation theorem there exists a polynomial p′ with
the property that | |t| − p′(t) | < ε/2 for every t on the closed interval
[−||f||, ||f||]. If p is the polynomial which results when the constant
term of p′ is replaced by 0, then, since |p′(0)| < ε/2, p is a polynomial
with 0 as its constant term which has the property that
| |t| − p(t) | < ε for every t in
[−||f||, ||f||]. Since A is an algebra, the function p(f) in C(X,R) is in A.
By the stated property of p, it is easy to see that | |f(x)| − p(f(x)) | < ε
for every point x in X, and from this it follows that || |f| − p(f) || ≤ ε.
We conclude the proof by remarking that since A is closed, the fact that
|f| can be approximated by the function p(f) in A shows that |f| is in A.
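
The proof appeals to the Weierstrass approximation theorem for a polynomial close to |t| with zero constant term. As an illustration only, the sketch below uses a different and more explicit device, the classical recursion p_0 = 0, p_{k+1}(t) = p_k(t) + (t² − p_k(t)²)/2 on [−1,1]. Each p_k is a polynomial in t which vanishes at t = 0, and the error |t| − p_k(t) is known to be smaller than 2/(k + 1). The function name and the grid are hypothetical.

```python
def abs_approx(t, n):
    """Value at t of p_n, where p_0 = 0 and p_{k+1} = p_k + (t^2 - p_k^2)/2.

    Intended for -1 <= t <= 1; for the interval [-||f||, ||f||] one would
    first rescale t by ||f||.
    """
    p = 0.0
    for _ in range(n):
        p = p + (t * t - p * p) / 2
    return p


# Largest error of p_200 on a grid of 201 points in [-1, 1].
grid = [k / 100 for k in range(-100, 101)]
max_error = max(abs(abs(t) - abs_approx(t, 200)) for t in grid)
```

Since p_n(0) = 0 for every n, composing p_n with a function f in A never requires a constant function, which is the point of discarding the constant term in the proof.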


We are now in a position to prove the Stone-Weierstrass theorems. 


Theorem A (the Real Stone-Weierstrass Theorem). Let X be a compact 
Hausdorff space, and let A be a closed subalgebra of C(X,R) which separates 
points and contains a non-zero constant function. Then A equals C(X,R). 

PROOF. If X has only one point, then C(X,R) contains only constant
functions; and since A contains a non-zero constant function and is an
algebra, it contains all constant functions and equals C(X,R). We may
thus assume that X has more than one point. By the above lemmas, it
suffices to show that if x and y are distinct points of X, and if a and b are
any two real numbers, then there exists a function f in A such that
f(x) = a and f(y) = b. Since A separates points, there exists a function
g in A such that g(x) ≠ g(y). If we now define f by

\[ f(z) = a\,\frac{g(z) - g(y)}{g(x) - g(y)} + b\,\frac{g(z) - g(x)}{g(y) - g(x)}, \]

then f clearly has the required properties.
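
The interpolating function in this proof is easy to test on a concrete case. The following sketch builds f from a separating function g by the two-term formula f(z) = a(g(z) − g(y))/(g(x) − g(y)) + b(g(z) − g(x))/(g(y) − g(x)); the choice X = [0,1], the function g(t) = t³, and the helper name are assumptions made for illustration.

```python
def two_point_interpolant(g, x, y, a, b):
    """Return f built from g, the constants, and the algebra operations,
    with f(x) = a and f(y) = b. Requires g(x) != g(y)."""
    gx, gy = g(x), g(y)
    if gx == gy:
        raise ValueError("g does not separate x and y")

    def f(z):
        return a * (g(z) - gy) / (gx - gy) + b * (g(z) - gx) / (gy - gx)

    return f


# Hypothetical separating function on X = [0, 1].
g = lambda t: t ** 3
f = two_point_interpolant(g, 0.25, 0.75, 3.0, -2.0)
```

Note that f is a linear combination of g and a constant function, so it lies in A as soon as A is an algebra containing the constants.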


We next turn our attention to the complex case, that is, to conditions 
which guarantee that a closed subalgebra of C(X,C) equals C(X,C).
It is first of all necessary to understand that the conditions of Theorem A 
will not suffice. The simplest example which shows this requires a little 
knowledge of the theory of analytic functions, and the reader without 
such knowledge may skip at once to the next paragraph. Let X be the 
closed unit disc {z : |z| ≤ 1} in the complex plane. X is clearly a compact
Hausdorff space. Consider the set A of all functions in C(X,C) which 
are analytic in the interior of X. A is evidently a subalgebra of C(X,C), 
and one sees that it is closed by using Morera’s theorem. A separates 
points, for it contains the function f defined by f(z) = z. It also contains 
all constant functions. In spite of this, A does not equal C(X,C); for 
the function g defined by g(z) = z̄ is in C(X,C) but is not in A, since it is
not differentiable at any point. 

What can be done to salvage Theorem A in the complex case?
The answer lies in the operation of conjugation discussed at the end of Sec.
20. If f is a complex function defined on a topological space X, we
remind the reader that its conjugate f̄ is defined by taking f̄(x) to be the
complex conjugate of f(x). It will
also be convenient for us to define the real part and the imaginary part of f:

\[ R(f) = \frac{f + \bar{f}}{2} \quad\text{and}\quad I(f) = \frac{f - \bar{f}}{2i}. \tag{1} \]


We observe that if a complex function f has different values at two distinct 
points of X, then at least one of the functions R(f) and I(f) also has 
different values at these points. 


Theorem B (the Complex Stone-Weierstrass Theorem). Let X be a 
compact Hausdorff space, and let A be a closed subalgebra of C(X,C) which 
separates points, contains a non-zero constant function, and contains the 
conjugate of each of its functions. Then A equals C(X,C). 

PROOF. The real functions in A clearly form a closed subalgebra B of
C(X,R). Let us assume for a moment that B equals C(X,R). If f is an
arbitrary function in C(X,C), then R(f) and I(f) are in C(X,R), and are
thus in B. But since f = R(f) + iI(f) and A is an algebra, f is in A and
A equals C(X,C). It therefore suffices to show that B equals C(X,R).
We prove this by applying Theorem A.

We begin by showing that B separates points. Let x and y be
distinct points in X. Since A separates points, there exists a function
f in A which has different values at x and y. As we saw in the above
remarks, R(f) or I(f) also has different values at x and y. Since A is an
algebra which contains the conjugate of each of its functions, formulas (1)
show that both R(f) and I(f) are in B, so B separates points. We next
show that B contains a non-zero constant function. By our hypothesis,
A contains some non-zero constant function g. A is an algebra which
contains the conjugate of each of its functions, so gḡ = |g|^2 is a non-zero
constant function in B. Theorem A now implies directly that B equals
C(X,R), and our proof is complete.


The two Stone-Weierstrass theorems are among the most important
facts in modern analysis. The theory developed in the last three chapters
of this book could hardly exist without them, and they have many other
applications as well.¹

¹ See Stone [40].


Problems


1. Prove the two-variable Weierstrass approximation theorem: if
f(x,y) is a real function defined and continuous on the closed rectangle
X = [a,b] × [c,d] in the Euclidean plane R^2, then f can be uniformly
approximated on X by polynomials in x and y with real coefficients.

2. Let X be the closed unit disc in the complex plane, and show that any
function f in C(X,C) can be uniformly approximated on X by
polynomials in z and z̄ with complex coefficients.

3. Let X and Y be compact Hausdorff spaces, and f a function in
C(X × Y,C). Show that f can be uniformly approximated by
functions of the form ∑_{i=1}^{n} f_i g_i, where the f_i's are in C(X,C) and the
g_i's are in C(Y,C).


37. LOCALLY COMPACT HAUSDORFF SPACES 


In Sec. 23 we defined a locally compact space to be a topological space 
in which each point has a neighborhood with compact closure. Locally 
compact spaces often arise in the applications of topology to geometry 
and analysis, and since those which do are almost always Hausdorff 
spaces, we restrict our attention in this section to locally compact 
Hausdorff spaces. 

The main fact about such a space is that it can be converted into a
compact Hausdorff space by suitably adjoining a single point. The
reader is perhaps familiar from analysis with the prototype of this process,
in which the complex plane C is enlarged by adjoining to it an “ideal
point” called the point at infinity and denoted by ∞. This ideal point
can be thought of as any object not in C, and we denote by C_∞ the larger
set C ∪ {∞}. C_∞ is called the extended complex plane when the
neighborhoods of ∞ (other than C_∞ itself) are taken to be the complements
in C_∞ of the closed and bounded subsets (i.e., the compact subspaces) of C.
These ideas add nothing to our understanding of the complex plane, but
they do clarify many proofs and simplify the statements of many theorems,
and they are valuable for this reason. Figure 27 gives an easy way
of visualizing the extended complex plane. In this figure, the surface S
of a sphere of radius 1/2 is rested tangentially on C at the origin. It is
customary to call the point of contact the south pole and the opposite
point the north pole. The indicated projection from the north pole
establishes a homeomorphism between S minus its north pole and C, so
from the topological point of view, S minus its north pole can be regarded
as essentially identical with the complex plane C. The north pole of S
can be considered to be the point at infinity, and passing from C to
C_∞ amounts to using the point ∞ to plug up the hole in C at the north
pole. When S is identified in this manner with the extended complex
plane, it is usually called the Riemann sphere. In summary, the locally
compact Hausdorff space C has been made into the compact Hausdorff
space S by adding the single point ∞.

Fig. 27. The Riemann sphere.
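
The projection from the north pole can be written in coordinates. The sketch below assumes that C is the plane Z = 0 in three-dimensional space, that S has radius 1/2 and center (0, 0, 1/2), and that the north pole is (0, 0, 1); these coordinates and the function names are choices made for illustration. The line through the north pole and the point (x, y, 0) then meets S at (x, y, r²)/(1 + r²), where r² = x² + y².

```python
def to_sphere(z):
    """Send a complex number to the matching point of S minus its north pole."""
    r2 = z.real ** 2 + z.imag ** 2
    return (z.real / (1 + r2), z.imag / (1 + r2), r2 / (1 + r2))


def to_plane(p):
    """Inverse map: project a point of S (other than the north pole) onto C."""
    X, Y, Z = p
    if Z == 1:
        raise ValueError("the north pole corresponds to the point at infinity")
    return complex(X / (1 - Z), Y / (1 - Z))


def on_sphere(p, tol=1e-12):
    """Check that p lies on the sphere of radius 1/2 centered at (0, 0, 1/2)."""
    X, Y, Z = p
    return abs(X * X + Y * Y + (Z - 0.5) ** 2 - 0.25) < tol
```

Points of large absolute value land near the north pole, which matches the description of the neighborhoods of the point at infinity as complements of bounded sets.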

We now generalize the construction outlined above to the case of an
arbitrary locally compact Hausdorff space X. Let ∞ be an object not
in X, and form the set X_∞ = X ∪ {∞}. We define a topology on X_∞ by
specifying the following as open sets: (i) the open subsets of X, regarded
as subsets of X_∞; (ii) the complements in X_∞ of the compact subspaces
of X; and (iii) the full space X_∞. If we keep in mind the fact that a
compact subspace of a Hausdorff space is closed, then it is easy to show
that this class of sets actually is a topology on X_∞, and also that the given
topology on X equals its relative topology as a subspace of X_∞. The
following are the main properties of the topological space X_∞.

(1) X_∞ is compact. To prove this, let {G_i} be an open cover of X_∞.
We must produce a finite subcover. If X_∞ occurs among the G_i's, then
{G_i} clearly has a finite subcover, namely, {X_∞}. We may therefore
assume that each G_i is a set of type (i) or type (ii). At least one G_i, say
G_{i_0}, must contain the point ∞, and this set is necessarily of type (ii).
Its complement G_{i_0}′ is thus a compact subspace of X which is contained in
the union of some class of open subsets of X of the form G_i ∩ X, so it is
contained in the union of some finite subclass of these sets, say
{G_1 ∩ X, G_2 ∩ X, ..., G_n ∩ X}. It is now easy to see that the class
{G_{i_0}, G_1, G_2, ..., G_n} is a finite subcover of the original open cover of
X_∞, so X_∞ is compact.

(2) X_∞ is Hausdorff. X is Hausdorff, so any pair of distinct points
in X_∞ both of which lie in X can be separated by open subsets of X, and
thus can be separated by open subsets of X_∞ of type (i). It therefore
suffices to show that any point x in X and the point ∞ can be separated by
open subsets of X_∞. X is locally compact, so x has a neighborhood G
whose closure Ḡ in X is compact. It is now clear that G and Ḡ′ are
disjoint open subsets of X_∞ such that x ∈ G and ∞ ∈ Ḡ′, so X_∞ is Hausdorff.

The compact Hausdorff space X_∞ associated with the locally compact
Hausdorff space X in the manner described above is called the one-point
compactification of X, and the point ∞ is called the point at infinity.
We know that compact spaces are locally compact, so these ideas apply
without change when X is a compact Hausdorff space. It is easy to see
that the locally compact Hausdorff space X is compact ⇔ ∞ is an isolated
point of X_∞. It may seem useless to consider the one-point
compactification of a compact Hausdorff space, but we shall see in the next section
that it enables us to weaken the hypotheses of the Stone-Weierstrass
theorems.

The one-point compactification is useful mainly in simplifying the
proofs of theorems about locally compact Hausdorff spaces. As an
example, any space X of this kind is easily seen to be completely regular;
for X is a subspace of X_∞, which is compact Hausdorff and therefore
completely regular, and every subspace of a completely regular space is
completely regular. Accordingly, if x is a point of X, and G a
neighborhood of x which does not equal the full space, then there exists a
continuous real function f defined on X, all of whose values lie in the closed
unit interval [0,1], such that f(x) = 1 and f(G′) = 0. This fact can
easily be generalized, again by using the one-point compactification, to
the case in which the point x is replaced by an arbitrary compact
subspace of X.


Theorem A. Let X be a locally compact Hausdorff space, let C be a
compact subspace of X, and let G be a neighborhood of C which does not equal
the full space. Then there exists a continuous real function f defined on X,
all of whose values lie in the closed unit interval [0,1], such that f(C) = 1
and f(G′) = 0.

PROOF. Let X_∞ be the one-point compactification of X. Then C and
G′ are disjoint closed subspaces of X_∞, and by Urysohn's lemma there
exists a continuous real function g defined on X_∞, all of whose values lie
in [0,1], such that g(C) = 1 and g(G′) = 0. If f is the restriction of g to
X, then f has the required properties.


This result is an important tool in the theory of measure and inte- 
gration on locally compact Hausdorff spaces. 


Problems 


1. Let X be a locally compact Hausdorff space, and C_1 and C_2 disjoint
compact subspaces of X. Show that C_1 and C_2 have disjoint
neighborhoods whose closures are compact.

2. Show that a Hausdorff space is locally compact ⇔ each of its points
is an interior point of some compact subspace.

3. Let f be a mapping of a locally compact space X onto a Hausdorff 
space Y. If f is both continuous and open, show that Y is also 
locally compact. 

4. Show that if the product of a non-empty class of Hausdorff spaces is 
locally compact, then each coordinate space is also locally compact.


38. THE EXTENDED STONE-WEIERSTRASS THEOREMS


Let X be a locally compact Hausdorff space. Our present purpose is 
to generalize the theorems of Sec. 36 to this context. 

A real or complex function f defined on X is said to vanish at infinity
if for each ε > 0 there exists a compact subspace C of X such that
|f(x)| < ε for every x outside of C. On the real line, for instance, the
functions f and g defined by f(x) = e^{-x^2} and g(x) = (x^2 + 1)^{-1} have this
property, but the non-zero constant functions do not. It is easy to see
that if X is compact, then every real or complex function defined on X
vanishes at infinity, so in this case the requirement that a function vanish
at infinity is no restriction at all.
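
For the two functions on the real line mentioned above, a suitable compact set can be written down explicitly as an interval [−R, R]. The sketch below solves e^{−x²} < ε and 1/(x² + 1) < ε for |x|; the function names are hypothetical, and the radii are one convenient choice among many.

```python
import math


def radius_for_gaussian(eps):
    """R such that exp(-x^2) < eps whenever |x| > R (eps > 0)."""
    return math.sqrt(math.log(1 / eps)) if eps < 1 else 0.0


def radius_for_rational(eps):
    """R such that 1/(x^2 + 1) < eps whenever |x| > R (eps > 0)."""
    return math.sqrt(1 / eps - 1) if eps < 1 else 0.0


def small_outside(func, radius, eps, offsets=(1e-3, 0.5, 1.0, 10.0)):
    """Spot-check |func(x)| < eps at a few points on both sides of [-R, R]."""
    return all(
        abs(func(sign * (radius + d))) < eps
        for d in offsets
        for sign in (1, -1)
    )
```

A non-zero constant function fails any such check: its absolute value is the same at every point, so no compact set works once ε is smaller than that value.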

We denote by C_0(X,R) the set of all continuous real functions defined
on X which vanish at infinity. C_0(X,C) is defined similarly. If f is a
function in one of these sets, then since |f(x)| < ε outside of some compact
subspace C of X, and f is bounded on C, f is necessarily bounded on all of
X. It follows from this that C_0(X,R) ⊆ C(X,R) and C_0(X,C) ⊆ C(X,C).
Further, the remark in the preceding paragraph shows that when X is
compact we have equality in each case.


Lemma. C_0(X,R) and C_0(X,C) are closed subalgebras of C(X,R) and
C(X,C).

PROOF. We first show that C_0(X,R) is a closed subset of C(X,R). It
suffices to show that if f is a function in C(X,R) which is in the closure of
C_0(X,R), then f vanishes at infinity. Let ε > 0 be given. Since f is in
the closure of C_0(X,R), there exists a function g in C_0(X,R) such that
||f − g|| < ε/2, and this implies that |f(x) − g(x)| < ε/2 for all x. The
function g vanishes at infinity, so there exists a compact subspace C of
X such that |g(x)| < ε/2 for all x outside of C. It now follows at once
that

\[ |f(x)| = |[f(x) - g(x)] + g(x)| \le |f(x) - g(x)| + |g(x)| < \varepsilon/2 + \varepsilon/2 = \varepsilon \]

for all x outside of C, so f vanishes at infinity. The same argument shows
that C_0(X,C) is a closed subset of C(X,C).

We next show that if f and g are in C_0(X,R), then f + g is also in
C_0(X,R), that is, that f + g vanishes at infinity. Let ε > 0 be given.
Since f vanishes at infinity, there exists a compact subspace C_1 of X
outside of which |f(x)| < ε/2. Similarly, there exists a compact subspace
C_2 of X outside of which |g(x)| < ε/2. C = C_1 ∪ C_2 is then a compact
subspace of X outside of which

\[ |(f + g)(x)| = |f(x) + g(x)| \le |f(x)| + |g(x)| < \varepsilon/2 + \varepsilon/2 = \varepsilon, \]

so f + g vanishes at infinity. We can prove in much the same way that
C_0(X,R) is also closed with respect to scalar multiplication and
multiplication; and since C_0(X,R) is non-empty (it contains the function which
is identically zero), it is clearly a subalgebra of C(X,R). Similarly,
C_0(X,C) is a subalgebra of C(X,C).


This lemma permits us to regard C_0(X,R) and C_0(X,C) as algebras of
functions in their own right. We next establish a natural and useful
connection between continuous functions defined on X which vanish at
infinity and continuous functions defined on X_∞ which vanish at the
point ∞, where of course X_∞ is the one-point compactification of X.
It is important here to be quite clear about the distinction between these
concepts. For a function on X to vanish at infinity means precisely what
the above definition says. Such a function need not have 0 as a value.
On the other hand, to say that a function on X_∞ vanishes at the point ∞
is to say that this function assumes the value 0 at the point ∞.


Lemma. C_0(X,R) equals the set of all restrictions to X of those functions
in C(X_∞,R) which vanish at the point ∞. Similarly, C_0(X,C) equals the
set of all restrictions to X of the functions in C(X_∞,C) which vanish at the
point ∞.

PROOF. Let g be a function in C(X_∞,R) which vanishes at the point ∞.
Since g is continuous at ∞, for each ε > 0 there exists a neighborhood
G of ∞ such that |g(x)| < ε for all x in G. By the definition of a
neighborhood of ∞ given in Sec. 37, G is the full space X_∞ or the complement
in X_∞ of a compact subspace of X. In either case, there clearly exists a
compact subspace C of X such that |g(x)| < ε for every point x in X and
outside of C. In other words, the restriction f of g to X vanishes at
infinity, so is a function in C_0(X,R). We must also show, conversely,
that every function f in C_0(X,R) arises in this way from some function
g in C(X_∞,R) which vanishes at the point ∞. All that is necessary is to
define g by g(x) = f(x) for every x in X and g(∞) = 0, and to observe
that the condition that f vanishes at infinity is precisely what is needed
to guarantee that g is continuous at ∞. The proof of the second
statement of the lemma is exactly the same.


The machinery given above is intended to make the proofs of the 
following two theorems relatively simple. They are called the extended 
Stone-Weierstrass theorems. 


Theorem A. Let X be a locally compact Hausdorff space, and let A be a
closed subalgebra of C_0(X,R) which separates points and for each point in
X contains a function which does not vanish there. Then A equals C_0(X,R).

PROOF. Let X_∞ be the one-point compactification of X. By our second
lemma, we can extend every function in A to a function in C(X_∞,R) which
vanishes at ∞. We denote the set of all these extensions by A_0. Our
hypotheses imply that A_0 is a closed subalgebra of C(X_∞,R) which
separates points and has the property that all its functions vanish at ∞. Let
A_1 be the set of functions obtained by adding all constant functions to
each function in A_0. It is easy to see that A_1 is a closed subalgebra of
C(X_∞,R) which separates points and contains a non-zero constant
function, so by Theorem 36-A, A_1 equals C(X_∞,R). It follows from this that
A_0 consists of all functions in C(X_∞,R) which vanish at ∞, and another
application of our second lemma shows that A equals C_0(X,R).


Theorem B. Let X be a locally compact Hausdorff space, and let A be a
closed subalgebra of C_0(X,C) which separates points, for each point in X
contains a function which does not vanish there, and contains the conjugate of
each of its functions. Then A equals C_0(X,C).

PROOF. The proof of Theorem A will serve here almost word for word.


We observe that when X is assumed to be compact in the above two 
theorems, so that C_0(X,R) = C(X,R) and C_0(X,C) = C(X,C), then
they constitute slightly stronger forms of the Stone-Weierstrass theorems, 
for they yield the same conclusions under slightly weaker assumptions. 


Problems 


1. If X is a locally compact Hausdorff space, prove that C_0(X,R) is a
sublattice of C(X,R).

2. Let X be a locally compact Hausdorff space, and show that the weak
topology generated by C_0(X,R) equals the given topology.

3. Let X be a locally compact Hausdorff space and S a subset of C_0(X,R)
which separates points and for each point in X contains a function
which does not vanish there. Show that the weak topology
generated by S equals the given topology.


PART TWO 


Operators 


CHAPTER EIGHT 


Algebraic Systems 


We have seen in the preceding chapters that one of the basic aims of 
topology and modern analysis is the study of the bounded real and com- 
plex functions which are defined and continuous on a topological space X. 
These functions can of course be studied individually, but this doesn’t 
carry us very far. It is desirable to consider the sets C(X,R) and C(X,C)
of all such functions as mathematical systems with a high level of internal 
organization, and this program compels us to give serious attention to 
their structural features. It is at this point that algebra enters the 
picture; for modern algebra is essentially the result of crystallizing into 
abstract form, and studying for their own sake, a few simple patterns of 
structure which underlie many diverse parts of mathematics. 

In Secs. 14 and 20 we defined what is meant by a linear space and 
an algebra, but we did not develop the theory of these systems to any 
appreciable degree. We used them only descriptively, as a convenient 
means of calling attention to the fact that the points in the spaces R^n and
C^n can be added and multiplied by numbers, and those in C(X,R) and
C(X,C) can be multiplied together as well. Our work in the rest of this
book requires a deeper understanding of these systems and several 
others, and the purpose of this chapter is to provide a concise but reason- 
ably complete exposition of this necessary background material. 

The algebraic systems we discuss below (groups, rings, linear spaces,
and algebras) have been the subject of many books and innumerable
research articles. In the few pages we devote to each, we clearly can do
little more than explain what each system is, mention several outstanding
examples, and develop the theory to the limited extent required by our
later work. If the reader finds it desirable to amplify our abbreviated
treatment by consulting additional sources, we suggest McCoy [31] and
Halmos [17].


39. GROUPS 


We begin by considering two familiar algebraic systems, each of 
which is a group, with a view to pointing out those features common to 
both which are set forth abstractly in the general concept of a group. 

We first observe that the set R of all real numbers, together with the
operation of ordinary addition, has the following properties: the sum
of any two numbers in R is a number in R (R is closed under addition);
if x, y, z are any three numbers in R, then x + (y + z) = (x + y) + z
(addition is associative); there is present in R a special number, namely 0,
with the property that x + 0 = 0 + x = x for every x in R (R contains
an additive identity element); and to each number x in R there
corresponds another number in R, its negative −x, with the property that
x + (−x) = (−x) + x = 0 (R contains additive inverses).

It is equally clear that the set P of all positive real numbers, together
with the operation of ordinary multiplication, has the following
corresponding properties: the product of any two numbers in P is a number
in P (P is closed under multiplication); if x, y, z are any three numbers in
P, then x(yz) = (xy)z (multiplication is associative); there is present in
P a special number, namely 1, with the property that x1 = 1x = x for
every x in P (P contains a multiplicative identity element); and to each
number x in P there corresponds another number in P, its reciprocal
1/x = x^{-1}, with the property that xx^{-1} = x^{-1}x = 1 (P contains
multiplicative inverses).

Each of these systems plainly possesses many properties other than 
those we have mentioned. We ignore all such properties and concentrate 
our attention solely on the ones we have listed. Let us now consciously 
disregard the concrete nature of the elements composing the above sets 
and the familiar character of the algebraic operations involved. What 
remains in each case is a non-empty set which is closed under an operation 
possessing certain formal properties, and apart from notation and 
terminology, these properties are identical in the two systems. The 
concept of a group is a distillation of the common structural form of 
these and many other similar systems. 

The definition is as follows. A group is a non-empty set G together
with an operation (called multiplication) which associates with each
ordered pair x, y of elements in G a third element in G (called their
product and written xy) in such a manner that

(1) multiplication is associative, that is, if x, y, z are any three
elements in G, then x(yz) = (xy)z;

(2) there exists an element e in G, called the identity element (or
simply the identity), with the property that xe = ex = x for
every x in G; and

(3) to each element x in G there corresponds another element in G,
called the inverse of x and written x^{-1}, with the property that
xx^{-1} = x^{-1}x = e.

It should be carefully noted that we do not assume that xy = yx for all
elements x and y. A group which satisfies this additional condition is
called a commutative group or an Abelian group (after the Norwegian 
mathematician Abel). If G consists of a finite number of elements, then 
it is called a finite group and this number is called its order; otherwise, it 
is called an infinite group. 
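
For a finite set the group axioms can be verified by exhaustive search. The sketch below is a brute-force check written only for illustration: the names `is_group` and `is_abelian` are hypothetical, the elements are assumed to be hashable values with exact equality, and the operation is passed in as an ordinary two-argument function.

```python
from itertools import product


def is_group(elements, op):
    """Check closure and axioms (1)-(3) for a finite set with operation op."""
    elems = list(elements)
    members = set(elems)
    if not elems:
        return False  # a group is non-empty
    # Closure: the product of two elements must again lie in the set.
    if any(op(x, y) not in members for x, y in product(elems, repeat=2)):
        return False
    # Axiom (1): associativity.
    if any(op(x, op(y, z)) != op(op(x, y), z)
           for x, y, z in product(elems, repeat=3)):
        return False
    # Axiom (2): an identity element e with xe = ex = x.
    identities = [e for e in elems
                  if all(op(x, e) == x == op(e, x) for x in elems)]
    if not identities:
        return False
    e = identities[0]
    # Axiom (3): every element has a two-sided inverse.
    return all(any(op(x, y) == e == op(y, x) for y in elems) for x in elems)


def is_abelian(elements, op):
    """Check the additional condition xy = yx for all x and y."""
    elems = list(elements)
    return all(op(x, y) == op(y, x) for x, y in product(elems, repeat=2))
```

For instance, `is_group({1, -1}, lambda a, b: a * b)` holds, while the set {0, 1, 2} with multiplication modulo 3 fails axiom (3) because 0 has no inverse.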

In axiom (2) we speak of the identity, as if there were only one
identity element in G. This is indeed the case, for if e′ is also an element
in G such that xe′ = e′x = x for every x, then e′ = e′e = e shows that
e′ necessarily equals e. Similarly, in axiom (3) we speak of the inverse of
x, as if each element had only one inverse. This also is true, for if x′ is
another element in G such that xx′ = x′x = e, then

\[ x' = x'e = x'(xx^{-1}) = (x'x)x^{-1} = ex^{-1} = x^{-1} \]

shows that x′ equals x^{-1}.

If we have succeeded in disengaging ourselves from our intuitive
ideas, we must admit that we know nothing whatever about the actual
nature of either the set G or the operation. Both are completely abstract,
and it is essential to understand that our knowledge of G and its operation
is strictly confined to the information contained in the above axioms.¹
As our examples below will show, the elements of a group need not be
numbers at all, and its operation can perfectly well be some bizarre rule
of combination which bears no relation to the usual operations of
elementary algebra. In its essence, the study of groups is the study of a
single algebraic operation in its purest form, and the theory of groups is
the body of theorems, together with their applications, which can be
deduced from the given axioms. This theory is richer in content than
can easily be imagined by anyone who has not delved into it for himself,
and the applications reach throughout mathematics and even beyond,
into such strikingly diverse fields as geometry, the theory of the
solvability of algebraic equations, crystallography, quantum theory, and the
theory of relativity.

¹ Some writers emphasize this by referring to G as an abstract group. This
concept is then contrasted with that of a concrete group, such as the group of all real
numbers with addition. The abstract character of an abstract group is sometimes
further emphasized by using a noncommittal symbol like x * y in place of xy and by
speaking of the star operation, or the group operation, instead of multiplication.


Example 1. We have seen that the real numbers form a group with 
respect to addition. In the present example we place this group in a 
context of several groups of the same type, that is, groups whose ele- 
ments are numbers, whose operation is ordinary addition, and in which 
the identity is 0 and the inverse of a number is its negative. 

(a) The single-element set consisting only of the number 0. 

(b) The set I of all integers. Observe that the set of all non-
negative integers meets every requirement for a group except axiom (3), 
so it is not a group. 

(c) The set of all even integers. The odd integers do not con- 
stitute a group, for the sum of two odd integers is even. 

(d) The set of all rational numbers. 

(e) The set R of all real numbers.

(f) The set C of all complex numbers. 

(g) The set of all complex numbers whose real and imaginary parts 
are both integers. 


Example 2. We also saw at the beginning of this section that the posi- 
tive real numbers form a group with respect to multiplication. Again, 
this group is only one among many of a similar kind, some of which are 
listed below. In all these the elements are numbers, the operation is 
ordinary multiplication, the identity is 1, and the inverse of a number is 
its reciprocal. 

(a) The single-element set consisting only of the number 1. 

(b) The two-element set {1,—1}. 

(c) The set of all positive rational numbers. 

(d) The set of all positive real numbers. Observe that the negative
real numbers do not form a group, for this set is not closed under
multiplication.

(e) The set of all non-zero real numbers. The set R of all real
numbers contains a number, namely 0, which has no reciprocal, so it is
not a group with respect to the operation considered here.

(f) The set of all non-zero complex numbers.

(g) The set {1, i, −1, −i} of the four fourth roots of unity.

(h) The set {z : z^n = 1} of the n nth roots of unity for a fixed but
arbitrary positive integer n. Groups (a), (b), and (g) are the special
cases which correspond to choosing n equal to 1, 2, and 4. We see by this
that there exists a finite group of order n for each positive integer n.

(i) The unit circle {z : |z| = 1} in the complex plane. This is called
the circle group.
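
The closure of the nth roots of unity under multiplication can be observed numerically. The sketch below represents the roots as e^{2πik/n} for k = 0, 1, ..., n − 1 and compares products up to a small tolerance, since floating-point arithmetic is not exact; the function names and the tolerance are assumptions made for illustration.

```python
import cmath


def roots_of_unity(n):
    """The n complex numbers z with z^n = 1, listed by increasing angle."""
    return [cmath.exp(2j * cmath.pi * k / n) for k in range(n)]


def index_of(z, roots, tol=1e-9):
    """Position of the root lying within tol of z, or None if there is none."""
    for k, w in enumerate(roots):
        if abs(z - w) < tol:
            return k
    return None


def closed_under_multiplication(n):
    """Check that the product of any two nth roots of unity is again one."""
    roots = roots_of_unity(n)
    return all(index_of(u * v, roots) is not None
               for u in roots for v in roots)
```

In this indexing the product of the roots numbered j and k is the root numbered (j + k) mod n, so the inverse of the root numbered k is the one numbered (n − k) mod n.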




The groups in the next two examples are closely related to our work 
in the previous chapters. They differ from the groups described above 
in that their elements are not numbers. 


Example 3. (a) The set R^n of all n-tuples of real numbers, the operation
being coordinatewise addition. The identity here is

\[ 0 = (0, 0, \ldots, 0), \]

and the inverse of the element x = (x_1, x_2, ..., x_n) is

\[ -x = (-x_1, -x_2, \ldots, -x_n). \]

(b) The set C^n of all n-tuples of complex numbers with respect to
coordinatewise addition.


Example 4. (a) The set C(X,R) of all bounded continuous real func- 
tions defined on a topological space X. The operation here is pointwise 
addition, the identity is the function which is identically zero, and the 
inverse of a function f is the function −f defined by (−f)(x) = −f(x).

(b) The set C(X,C) of all bounded continuous complex functions 
defined on a topological space X, the operation again being pointwise 
addition. 


The following examples are somewhat miscellaneous in character. 
They should be illuminating to the reader who is not already familiar 
with these ideas, for several have nothing whatever to do with numbers. 


Example 5. (a) The class of all subsets of a set U, the operation being 
the formation of symmetric differences. The reader will recall that the 
symmetric difference of two sets A and B is defined by 


      A Δ B = (A - B) ∪ (B - A);


and in Problem 2-3 it was shown that this operation is associative, that
the identity is the empty set ∅, and that the inverse of a set is the set
itself. It is interesting to note that if U is non-empty, then this class
of sets does not constitute a group with respect to the formation of either 
unions or intersections. 

(b) Any ring of subsets of a set U (see Problem 2-4), the operation 
again being the formation of symmetric differences. 


Example 6. Let m be a positive integer and define I_m to be the set of all
non-negative integers less than m: I_m = {0, 1, ..., m - 1}. If a and b
are two numbers in I_m, we define their “sum” a + b to be the remainder
obtained when their ordinary sum is divided by m. If m is 7, for instance,
then I_7 = {0, 1, 2, 3, 4, 5, 6} and we have 2 + 3 = 5, 5 + 2 = 0, and
4 + 5 = 2. Figure 28 is a complete addition table for I_7: to find the
sum of any two numbers in the set, look for the first number down the
left-hand side, look for the second across the top, and observe their sum
in the corresponding place within the table.
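The table described in Example 6 can be regenerated mechanically. The following is a minimal sketch in Python; the names add_mod and addition_table are illustrative and do not come from the text.

```python
def add_mod(a, b, m):
    """The "sum" of a and b in I_m: the remainder left when the
    ordinary sum a + b is divided by m."""
    return (a + b) % m


def addition_table(m):
    """Return the table as a list of rows; row a, column b holds a + b in I_m."""
    return [[add_mod(a, b, m) for b in range(m)] for a in range(m)]


if __name__ == "__main__":
    m = 7
    # First number down the left-hand side, second number across the top.
    print("+ | " + " ".join(str(b) for b in range(m)))
    print("--+" + "-" * (2 * m))
    for a, row in enumerate(addition_table(m)):
        print(f"{a} | " + " ".join(str(s) for s in row))
```

Each row of the table is a rearrangement of 0, 1, ..., m - 1, which reflects the fact that the equation a + x = b always has exactly one solution in I_m.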


[Fig. 28. The addition table for I_7.]


Example 7. The set of all one-to-one mappings of a non-empty set X onto
itself. The operation here is the multiplication of mappings defined at
the end of Sec. 3: if f and g are two such mappings and x is an arbitrary
element in X, then (fg)(x) = f(g(x)). The fact that this system forms
a group was shown in Problems 3-1, 3-2, and 3-5.


Example 8. Consider the special case of the previous example in which the
set X is taken to be a finite set with n elements, e.g., the set
{1, 2, ..., n}.


A one-to-one mapping of this set onto itself is usually called a
permutation, for it can be regarded as a rearrangement of the elements of
the set. If n is 4, for instance, the permutation which sends 1 to 3,
2 to 1, 3 to 4, and 4 to 2 can be written in the convenient form

          ( 1 2 3 4 )
      p = ( 3 1 4 2 ),

where below each of the integers 1, 2, 3, 4 is placed its image under the
mapping p. If

          ( 1 2 3 4 )
      q = ( 4 1 3 2 )

is another such permutation, then their product pq (first q, then p) takes
1 to 4 then 4 to 2, 2 to 1 then 1 to 3, 3 to 3 then 3 to 4, and 4 to 2 then
2 to 1. This result can be written

           ( 1 2 3 4 )
      pq = ( 2 3 4 1 ).


The group of all permutations of n elements is denoted by S_n and called
the symmetric group of degree n. The detailed structure of symmetric
groups is of fundamental importance in the theory of the solvability of
algebraic equations.
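The product of two permutations can be checked by machine. The sketch below stores a permutation as a Python dictionary from each integer to its image; this representation and the names compose and inverse are assumptions made for illustration. The permutation q is read off from the description of the product given above (1 to 4, 2 to 1, 3 to 3, 4 to 2).

```python
# p sends 1 to 3, 2 to 1, 3 to 4, and 4 to 2, as stated in Example 8.
p = {1: 3, 2: 1, 3: 4, 4: 2}
# q as read off from the text's description of the product pq.
q = {1: 4, 2: 1, 3: 3, 4: 2}


def compose(f, g):
    """The product fg of Sec. 3: (fg)(x) = f(g(x)), so g acts first."""
    return {x: f[g[x]] for x in g}


def inverse(f):
    """The inverse mapping, obtained by reversing every arrow."""
    return {image: x for x, image in f.items()}


identity = {x: x for x in p}
pq = compose(p, q)

if __name__ == "__main__":
    print("pq =", pq)
    print("p times its inverse is the identity:",
          compose(p, inverse(p)) == identity)
```

Reversing the arguments of compose gives the product taken in the other order, which need not agree with pq.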


Example 9. Our final example is the group of symmetries of a square.
Imagine that Fig. 29 represents a cardboard square placed on a plane
with fixed axes in such a way that its center is at the origin and its sides
are parallel to the axes. This square is carried onto itself by the
following rigid motions: the identity motion I, which leaves fixed each
point of the square; the counterclockwise rotations R, R’, and R” about the
center through angles of 90, 180, and 270 degrees; the reflections H and V
about the horizontal and vertical axes; and the reflections D and D’ about
the indicated diagonals. Each of these rigid motions is related to a
certain aspect of the symmetry of the square, and they are therefore
called symmetries. We multiply two symmetries by performing them
in succession, beginning with the one on the right. Accordingly, RV is
the result of first reflecting the square about the vertical axis, then
rotating it counterclockwise through 90 degrees. If we trace the effect
of these motions by following the manner in which the numbered vertices
are shifted about, we see that RV has the same result as D, so RV = D.
These eight symmetries, together with the operation we have described,
are easily seen to form a group. Associativity is a special case of
Problem 3-1; I is evidently the identity; and it is clear that H, V, D,
and D’ are their own inverses and that R^{-1} = R”, R’^{-1} = R’, and
R”^{-1} = R. In much the same way, we can define the group of symmetries
of an isosceles triangle, an equilateral triangle, a rectangle, a regular
pentagon, etc., and in each case the group describes in a precise fashion
the “symmetry characteristics” of the figure. We can go even further and
consider the group of symmetries of a regular solid in ordinary
three-dimensional space. Groups of this kind have interesting and
important applications in crystallography.

[Fig. 29. The symmetries of a square.]


We now return briefly to the consideration of a general group G. 
One of the more elementary facts about G is that certain simple equations 
are always solvable. 

(4) If a and b are any two elements of G, then the equations
ax = b and ya = b have solutions x and y in G.

To prove (4), we have only to observe that x = a^{-1}b and y = ba^{-1} are
in fact solutions, since a(a^{-1}b) = (aa^{-1})b = eb = b and

      (ba^{-1})a = b(a^{-1}a) = be = b.


Not only are the equations in (4) solvable in G, but their solutions are 
unique. This is a direct consequence of the following cancellation law. 

(5) If a is any element in G, then ax = ax’ ⇒ x = x’ and
ya = y’a ⇒ y = y’.

We prove the first of these statements by multiplying ax = ax’ by
a^{-1} on the left. This gives a^{-1}(ax) = a^{-1}(ax’), from which we get
(a^{-1}a)x = (a^{-1}a)x’, ex = ex’, and x = x’. The second is proved
similarly.

It is sometimes useful to know that (4) is capable of replacing axioms
(2) and (3) in the definition of a group. This amounts to the assertion
that if G is a non-empty set which is closed under an associative
multiplication with property (4), then G is a group. To prove this, we
must show that G has an identity element and that each element in G has an
inverse. We reason as follows. Let c be an element in G, and e a
solution of yc = c. If a is any element in G and x is a solution of cx = a,
then ea = e(cx) = (ec)x = cx = a, so e acts as an identity on the left.
We still must show that ae = a. For any element b in G, denote a
solution of yb = e by b^{-1} and call it a left inverse of b. In particular,
a^{-1}a = e. It is clear that (a^{-1}a)a^{-1} = ea^{-1} = a^{-1}, so
a^{-1}(aa^{-1}) = a^{-1}; and if we multiply both sides of this on the left
by a left inverse of a^{-1}, we get aa^{-1} = e. It is now easy to see that
ae = a, for

      ae = a(a^{-1}a) = (aa^{-1})a = ea = a.


As the reader will observe, we have not only shown that e is an identity 
element, but we have also shown that each element has an inverse in the 
required sense. 

A subgroup of a group G is a non-empty subset H of G which is itself
a group with respect to the operation in G. It is easy to see that the
identity e’ in H equals the identity e in G; for e’e’ = e’ = e’e, and by the
cancellation law in G we have e’ = e. Also, if x is an element in H and x’
is its inverse in H, so that xx’ = x’x = e, then x’ equals the inverse
x^{-1} of x in G; for xx’ = e and xx^{-1} = e yield xx’ = xx^{-1}, and
another application of the cancellation law in G gives x’ = x^{-1}. By
these remarks, we see that a non-empty subset H of G is a subgroup of
G ⇔ it is closed under multiplication, it contains the identity e of G,
and it contains the inverse x^{-1} of each of its elements x. Since
xx^{-1} = e, it is equally clear that a non-empty subset H of G is a
subgroup of G ⇔ it is closed under multiplication and the formation of
inverses.
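For a finite group the criterion just stated can be tested exhaustively. The following sketch does this for the group I_m of Example 6, where the operation is written additively, so that "closed under multiplication and the formation of inverses" becomes "closed under sums and negatives". The function name is_subgroup is hypothetical.

```python
def is_subgroup(H, m):
    """Decide whether H is a subgroup of I_m under addition mod m.

    H must be a non-empty subset of {0, 1, ..., m - 1} that is closed
    under the group operation and under the formation of inverses.
    """
    H = set(H)
    if not H or not H <= set(range(m)):
        return False
    closed_under_sums = all((a + b) % m in H for a in H for b in H)
    closed_under_negatives = all((-a) % m in H for a in H)
    return closed_under_sums and closed_under_negatives


if __name__ == "__main__":
    print(is_subgroup({0, 3, 6, 9}, 12))  # the multiples of 3 in I_12
    print(is_subgroup({0, 1}, 12))        # 1 + 1 = 2 is missing
```

Note that the identity is not tested separately: as the text observes, it is forced into H by the two closure conditions.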

Many of the groups described above are subgroups of other groups. 
For instance, in Example 1 it is easy to see that (a) is a subgroup of (c), 
(c) of (b), (b) of (d), (d) of (e), and (e) of (f). The subgroups of Exam- 
ple 7 are of particular importance, and are called transformation groups. 
If the underlying set is finite, as in Example 8, a transformation group 
is often called a permutation group. In the case of Example 9, for 
instance, each symmetry can be regarded as a permutation of the numbers
which label the vertices of the square; e.g., the reflection H about the
horizontal axis interchanges 1 and 4, and also 2 and 3, so we may put

          ( 1 2 3 4 )
      H = ( 4 3 2 1 ).

The group of symmetries of a square, being a subgroup of the symmetric
group S_4 of degree 4, is thus a permutation group. It is obvious that
any group G has {e} and G itself as trivial subgroups.

Every group in our list of examples is Abelian, with the exception 
of the last three. We ask the reader to show in Problem 9 that Example 7 
is non-Abelian whenever the set X contains more than two elements. 
It will follow from this that the symmetric group S_n is non-Abelian
whenever n ≥ 3. This can easily be seen for S_4 by computing the product
qp, where p and q are the permutations given in Example 8:

           ( 1 2 3 4 )
      qp = ( 3 4 2 1 ).

Since pq ≠ qp, S_4 is non-Abelian. We saw in Example 9 that RV = D.
If we now compute VR, we get VR = D’, so RV ≠ VR and the group
of symmetries of a square is also non-Abelian.

When the theory of groups is studied for its own sake, the emphasis 
is usually placed on non-Abelian groups. The present section, however, 
is intended mainly to provide a proper foundation for our work in the 
rest of this chapter, and Abelian groups are the ones of greatest
importance for us. In the Abelian case, the multiplicative notation used
above is often replaced by additive notation, in which the product xy is
written x + y and called the sum of x and y. Correspondingly, the identity
is denoted by 0 instead of e and is called the zero element (or simply
zero); and the inverse of x is denoted by -x instead of x^{-1} and is
called the negative of x. Also, the operation of subtraction is defined by


      x - y = x + (-y),


and the element x - y is called the difference between x and y. An
Abelian group in which this additive notation is used is called an additive 
Abelian group. It is clear that a subgroup of an additive Abelian group 
is a non-empty subset which is closed under addition and the formation 
of negatives. 


Problems 


1. Let G be a group, and show that (xy)^{-1} = y^{-1}x^{-1} for any two
elements x and y in G. Show also that (x^{-1})^{-1} = x for any element
x in G.

2. Let G be a finite non-empty set which is closed under an associative
multiplication with property (5). Show that G is a group. (Hint:
prove property (4) by considering the mappings of G into itself
defined by f(x) = ax and g(y) = ya.)

3. Prove that a group with the property that x^2 = e for every element
x is necessarily Abelian (needless to say, x^2 is the conventional
symbol for the product of x with itself: x^2 = xx).

4. Prove that a group of order n with n < 4 is necessarily Abelian.

5. Let H be a non-empty subset of a group G, and show that H is a
subgroup of G ⇔ xy^{-1} is in H whenever x and y are. We see from
this that a non-empty subset of an additive Abelian group is a
subgroup ⇔ it is closed under subtraction.

6. Let G be a group, and let C be the subset of G defined by


      C = {a : ax = xa for every x ∈ G}.


Prove that C is a subgroup of G. C is called the center of G.

7. Let m be a positive integer, consider the set


      I_m = {0, 1, ..., m - 1},


and define the “product” of any two numbers in it to be the
remainder left when their ordinary product is divided by m.
Construct a multiplication table similar to Fig. 28 for the non-zero
elements of Is. Does this set with this operation form a group?
Compute a similar table for the non-zero elements of I_7. Does
this system form a group?

8. Introduce a symbol for each element of the symmetric group S_3
of degree 3, and construct a multiplication table for this group.
Show that n! is the order of S_n.

9. Show that the group of all one-to-one mappings of a non-empty
set X onto itself is non-Abelian if X has more than two elements.

10. Construct a multiplication table for the group of symmetries of a
square.

11. Let G and G’ be groups. A mapping f of G into G’ is called a
homomorphism if f(xy) = f(x)f(y) for all elements x and y in G. Assume
that f is a homomorphism of G into G’, and prove the following facts:

(a) f(e) = e’, where e and e’ are the identity elements in G and G’;

(b) f(x^{-1}) = f(x)^{-1};

(c) f(G) is a subgroup of G’;

(d) f^{-1}({e’}) is a subgroup of G.

If a homomorphism is one-to-one, it is called an isomorphism. If
there exists an isomorphism of G onto G’, then G is said to be
isomorphic to G’. To say that one group is isomorphic to another is to say
that they have the same number of elements and the same group
structure, and differ only with respect to such inessentials as notation
and terminology. The reader will observe that the function f defined
on the real line by f(x) = a^x, where a is a fixed real number greater
than 1, is an isomorphism of the group of all real numbers with
addition onto the group of positive real numbers with multiplication, so
that these two systems as groups are abstractly identical. Now let G
be an arbitrary group, and let f be the mapping defined on G by
f(a) = M_a, where M_a is the mapping of G into itself given by
M_a(x) = ax. Show that f is an isomorphism of G into the group of
one-to-one mappings of G onto itself. This fact is called Cayley’s
theorem, and it shows that from the abstract point of view the theory
of groups is coextensive with the theory of transformation groups.


40. RINGS 


We have seen that the set I of all integers is an additive Abelian
group with respect to the operation of ordinary addition. It is just as 
important to observe that I is also closed under ordinary multiplication 
and that multiplication is linked to addition in a way which enriches the 
structure of the system as a whole. The theory of rings is the theory 
of such systems. 

A ring is an additive Abelian group R which is closed under a second
operation called multiplication (the product of two elements x and y
in R is written xy) in such a manner that

(1) multiplication is associative, that is, if x, y, z are any three
elements in R, then x(yz) = (xy)z; and

(2) multiplication is distributive, that is, if x, y, z are any three
elements in R, then x(y + z) = xy + xz and (x + y)z = xz + yz.

In other words, a ring is an additive Abelian group whose elements can 
be multiplied as well as added, and in which multiplication behaves 
reasonably with respect to itself and addition. We note particularly 
that multiplication is not assumed to be commutative. 
Many of the additive Abelian groups listed in the previous section 
are also rings with respect to natural multiplications. 


Example 1. Each of the following rings consists of numbers, and addi- 
tion and multiplication are understood to have their ordinary meanings. 
(a) The single-element set containing only the number 0.
(b) The set I of all integers.
(c) The set of all even integers.
(d) The set of all rational numbers.
(e) The set R of all real numbers.
(f) The set C of all complex numbers.
(g) The set of all complex numbers whose real and imaginary parts
are both integers.


Example 2. (a) C(X,R), with pointwise addition and multiplication.
(b) C(X,C), with pointwise addition and multiplication.


Example 3. Any ring of subsets of a set U, with addition and
multiplication defined by A + B = A Δ B and AB = A ∩ B (see Problems
2-3 and 2-4). The fact that a ring of sets is a ring in our present sense
is the reason for the name ring of sets.


Example 4. Let m be a positive integer, and I_m the set of all
non-negative integers less than m: I_m = {0, 1, ..., m - 1}. If a and b
are two numbers in I_m, we define their “sum” a + b and “product” ab to be
the remainders obtained when their ordinary sum and product are
divided by m. If m is 6, for instance, then I_6 = {0, 1, 2, 3, 4, 5}, and
we have 3 + 4 = 1 and 2 · 3 = 0. I_m with these operations is called
the ring of integers mod m.
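The two operations of Example 4 are easy to experiment with. The sketch below is an illustration only; ring_ops and is_distributive are hypothetical names, and the distributivity check is a brute-force test over all triples rather than a proof.

```python
def ring_ops(m):
    """Return the addition and multiplication of I_m as Python functions."""
    def add(a, b):
        return (a + b) % m   # remainder of the ordinary sum

    def mul(a, b):
        return (a * b) % m   # remainder of the ordinary product

    return add, mul


def is_distributive(m):
    """Check x(y + z) = xy + xz and (x + y)z = xz + yz for all x, y, z in I_m."""
    add, mul = ring_ops(m)
    elements = range(m)
    return all(
        mul(x, add(y, z)) == add(mul(x, y), mul(x, z))
        and mul(add(x, y), z) == add(mul(x, z), mul(y, z))
        for x in elements for y in elements for z in elements
    )


if __name__ == "__main__":
    add, mul = ring_ops(6)
    print("3 + 4 =", add(3, 4), "and 2 * 3 =", mul(2, 3), "in I_6")
    print("distributive laws hold in I_6:", is_distributive(6))
```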


We now consider a general ring R. Many familiar facts from ele- 
mentary algebra are valid in R. Nevertheless, each must be proved on 
its own merits from the axioms or previous theorems, for one never knows 
when something which appears to be ‘‘obvious” will turn out to be false. 

We have already defined subtraction in any additive Abelian group
by x - y = x + (-y), and it is easy to show that such statements as
-(x - y) = y - x and x = y ⇔ x - y = 0 are true. Problem 39-1
assures us that -(-x) = x. And so on. Properties of this kind relate
to the additive structure of R and are comparatively trivial. It is only 
when we consider multiplication, and its relation to addition, that we 
begin to encounter some interesting situations. 

We illustrate this by proving that x0 = 0 for any element x in R.
First, we have

      x0 + x0 = x(0 + 0) = x0.


Our next step is to add -x0 (the negative of x0) to both sides of this on
the right, which gives


      (x0 + x0) + (-x0) = x0 + (-x0);

and by the associativity of addition we can write this in the form

      x0 + (x0 + (-x0)) = x0 + (-x0).

Since the sum of any element and its negative is 0, this collapses to


      x0 + 0 = 0,

which yields x0 = 0.
Similarly, 0x = 0 for any x. We see in this way that the product of two
elements in a ring is zero whenever either factor is zero. We have given
the details of the proof of this seemingly obvious fact because, surprisingly
enough, its converse is false. It can perfectly well happen (and it often
does happen) that the product of two non-zero elements in a ring is zero.
The simplest examples of this phenomenon are found in the rings of
integers mod m where m is greater than 1 and is not a prime number.
We have already seen, for instance, that in I_6 the product of the two
non-zero elements 2 and 3 is 0. An element z in a ring such that either
zx = 0 for some non-zero x or yz = 0 for some non-zero y is called a
divisor of zero. In any ring with non-zero elements, the element 0 itself
is a divisor of zero.
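The divisors of zero in a ring of integers mod m can be listed by direct search. This is a small sketch under the definition just given; the name zero_divisors is illustrative. Since multiplication in I_m is commutative, only one of the two conditions zx = 0 and yz = 0 needs to be tested.

```python
def zero_divisors(m):
    """List every z in I_m with zx = 0 (mod m) for some non-zero x in I_m."""
    return [
        z for z in range(m)
        if any((z * x) % m == 0 for x in range(1, m))
    ]


if __name__ == "__main__":
    for m in (6, 7):
        print(f"divisors of zero in the integers mod {m}:", zero_divisors(m))
```

For m = 6 the search finds 0, 2, 3, and 4, while for the prime m = 7 it finds only 0.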

By using distributivity and the fact that the product of two
elements in the ring R is zero when either factor is zero, it is easy to
verify the following familiar rules of calculation: x(-y) = (-x)y = -xy,
(-x)(-y) = xy, x(y - z) = xy - xz, (x - y)z = xz - yz. As a simple
consequence of the last two of these rules, we have the following
cancellation law: if a is not a divisor of zero, then either of the relations
ax = ay or xa = ya implies that x = y.
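The steps behind this cancellation law are short enough to write out. The following derivation is a sketch of the left-hand case, using only the rule x(y - z) = xy - xz and the definition of a divisor of zero; the case xa = ya is the same with (x - y)z = xz - yz in place of that rule.

```latex
\begin{align*}
ax = ay
  &\implies ax - ay = 0 \\
  &\implies a(x - y) = 0
     && \text{by the rule } x(y - z) = xy - xz \\
  &\implies x - y = 0
     && \text{otherwise $x - y \neq 0$ would make $a$ a divisor of zero} \\
  &\implies x = y .
\end{align*}
```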

R is called a commutative ring if xy = yx for all elements x and y.
Every ring in the above list of examples is commutative. We shall 
encounter some non-commutative rings of very great importance 
in Secs. 44 and 45. 

If the ring R contains a non-zero element 1 with the property that
x1 = 1x = x for every x, then 1 is called an identity element (or an
identity), and R is called a ring with identity. If a ring has an identity,
then it has only one. In Example 1, only (a) and (c) have no identity. In
both rings described in Example 2, the identity is the function which is
identically 1. A ring of subsets of a set U has an identity ⇔ there exists
a non-empty set in the ring which contains every set in the ring; in
particular, if U is non-empty and the ring is a Boolean algebra of subsets
of U, then the set U itself is the identity. The ring I_m has an identity
⇔ m > 1.

Let R be a ring with identity. If x is an element in R, then it may
happen that there is present in R an element y such that xy = yx = 1.
In this case there is only one such element, and it is written x^{-1} and
called the inverse of x. If an element x in R has an inverse, then x is
said to be regular. Elements which are not regular are called singular.
Regular elements are often called invertible elements, or non-singular
elements. The element 0 is always singular in a ring with identity, and
the element 1 is always regular. In Example 1b, 1 and -1 are the only
regular elements; in 1d to 1f, all non-zero elements are regular; and in
1g, the regular elements are 1, i, -1, and -i.
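The same kind of search used for numbers applies to the rings of integers mod m of Example 4, which the text does not list here. The sketch below finds the regular elements of I_m for m > 1 by looking for an inverse of each element; the function name regular_elements is an assumption made for this illustration.

```python
def regular_elements(m):
    """List every x in I_m (m > 1) for which some y in I_m has xy = 1 (mod m).

    Multiplication in I_m is commutative, so xy = 1 already gives yx = 1.
    """
    return [
        x for x in range(m)
        if any((x * y) % m == 1 for y in range(m))
    ]


if __name__ == "__main__":
    for m in (6, 7):
        regular = regular_elements(m)
        singular = [x for x in range(m) if x not in regular]
        print(f"mod {m}: regular {regular}, singular {singular}")
```

In I_6 only 1 and 5 turn out to be regular, whereas in I_7 every non-zero element is regular.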

A ring with identity is called a division ring if all its non-zero
elements are regular. A field is a commutative division ring. The rational
numbers constitute a field, as do the real numbers and the complex 
numbers. Roughly speaking, fields are the “number systems” of 
mathematics. 


Problems 


1. In the ring of even integers, why is 2 neither regular nor singular?

2. Consider the ring of all subsets of a non-empty set U, with the 
operations defined in Example 3. What are the regular elements in 
this ring? What are the singular elements? What are the divisors 
of zero? Under what conditions is this ring a field? 

3. In each of the following rings of functions defined on the closed unit 
interval [0,1], describe the regular elements, the singular elements, 
and the divisors of zero: 

(a) all real functions; 

(b) all continuous real functions; 

(c) all bounded continuous real functions. 

What changes are necessary in these descriptions if [0,1] is replaced 
by (0,1)? 

4. Let R be a ring with identity, and show that any divisor of zero in
R is singular.

5. Let R be a ring with identity, and show that R is a division ring ⇔ the
non-zero elements of R form a group with respect to multiplication.

6. Show that the ring I_m is a field ⇔ m is a prime number. (Hint: in
showing that I_m is a field if m is prime, show first that in this case
I_m has no non-zero divisors of zero, so that the non-zero elements
of I_m are closed under multiplication and the cancellation law
ax = ay ⇒ x = y holds for these elements; now apply Problem 39-2
and Problem 5 above.)


41. THE STRUCTURE OF RINGS 


Let R be a ring. A non-empty subset S of R is called a subring
of R if the elements of S form a ring with respect to the operations
defined in R. This is equivalent to the requirement that S be closed
under the formation of sums, negatives, and products.

We concentrate our attention on a special type of subring. An ideal
in R is a subring I of R which has the following further property:


      i ∈ I ⇒ xi ∈ I and ix ∈ I for every element x ∈ R.

It is in this sense that an ideal in R can be described as a subring of R
which is closed with respect to multiplication on both sides by every
element of R. If the ideal I is a proper subset of R, then it is called a
proper ideal. The trivial ideals in R are the zero ideal {0} consisting of
the zero element alone, and the full ring R itself. We see from this that
every ring with non-zero elements has at least two distinct ideals.

In order to clarify the concept of an ideal, we mention a few specific 
examples. We begin by considering the ring of all integers. The even 
integers (i.e., all integral multiples of 2) obviously form an ideal in this 
ring. So also do all integral multiples of 3, of 4, and so on. In general,
if m is any positive integer, then the set


      m̄ = {..., -2m, -m, 0, m, 2m, ...}


of all integral multiples of m is a non-zero ideal. We next consider the
ring C[0,1] of all bounded continuous real functions defined on the closed
unit interval. If X is a subset of [0,1], then the set


      I(X) = {f : f(x) = 0 for every x ∈ X}


is an ideal in this ring. It is easy to see that I(X) equals the full ring
when X is the empty set and equals the zero ideal when X = (0,1).
As a final example, we consider the ring of all subsets of an infinite set U, 
and we observe that the class of all finite subsets of U is a proper ideal in 
this ring. 

Some rings have a multitude of non-trivial ideals, while others have 
none at all. In general, the structure of a ring is very closely connected 
with the ideals in it. The following theorem illustrates this point. 


Theorem A. If R is a commutative ring with identity, then R is a field ⇔
it has no non-trivial ideals.
PROOF. We first assume that R is a field, and we show that it has no
non-trivial ideals. It suffices to show that if I is a non-zero ideal in R,
then I = R. Since I is non-zero, it must contain some element a ≠ 0.
R is a field, so a has an inverse a^{-1}, and I (being an ideal) contains
1 = a^{-1}a. Since I contains 1, it also contains x = x1 for every x in R,
and therefore I = R.

We now assume that R has no non-trivial ideals, and we prove that
R is a field by showing that if x is a non-zero element in R, then x has an
inverse. The set I = {yx : y ∈ R} of all multiples of x by elements of
R is easily seen to be an ideal. Since I contains x = 1x, it is a non-zero
ideal, and it consequently equals R. We conclude from this that I
contains 1, and therefore that there is an element y in R such that yx = 1.
This shows that x has an inverse, so R is a field.
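The set of multiples used in the second half of the proof can be computed explicitly in a ring of integers mod m. The sketch below is an illustration only: it looks just at sets of the form {yx : y in I_m}, it does not enumerate every ideal, and the names multiples and every_nonzero_element_generates_ring are hypothetical.

```python
def multiples(x, m):
    """The set {yx : y in I_m} of all multiples of x, computed in I_m."""
    return {(y * x) % m for y in range(m)}


def every_nonzero_element_generates_ring(m):
    """True when the multiples of each non-zero x fill out all of I_m."""
    full_ring = set(range(m))
    return all(multiples(x, m) == full_ring for x in range(1, m))


if __name__ == "__main__":
    print("multiples of 2 in I_6:", sorted(multiples(2, 6)))
    print("multiples of 2 in I_7:", sorted(multiples(2, 7)))
    print("I_6:", every_nonzero_element_generates_ring(6))
    print("I_7:", every_nonzero_element_generates_ring(7))
```

In I_6 the multiples of 2 form a non-trivial ideal, so 1 is not among them and 2 has no inverse; in I_7 the multiples of each non-zero element contain 1, exactly as in the proof.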


The real significance of the ideals in a ring is that they enable us to
construct other rings which are associated with the first in a natural
way. We explain how this is done.

Let I be an ideal in a ring R. We use I to define an equivalence
relation in R as follows: two elements x and y in R are said to be
congruent modulo I, written x ≡ y (mod I), if x - y is in I. Since only one
ideal is under consideration, we abbreviate this symbolism to x ≡ y.
It is easy to verify that we actually do have an equivalence relation here,
that is, that the following three conditions are satisfied:

(1) x ≡ x for every x;

(2) x ≡ y ⇒ y ≡ x;

(3) x ≡ y and y ≡ z ⇒ x ≡ z.

Furthermore, congruences can be added and multiplied, as if they were
ordinary equations:

(4) x_1 ≡ x_2 and y_1 ≡ y_2 ⇒ x_1 + y_1 ≡ x_2 + y_2 and x_1y_1 ≡ x_2y_2.

The hypothesis of (4) is that x_1 - x_2 and y_1 - y_2 are elements of I, and
since I is an ideal, the conclusions follow at once from


      (x_1 + y_1) - (x_2 + y_2) = (x_1 - x_2) + (y_1 - y_2)

and   x_1y_1 - x_2y_2 = x_1y_1 - x_1y_2 + x_1y_2 - x_2y_2
                      = x_1(y_1 - y_2) + (x_1 - x_2)y_2.


It will be necessary to use property (4) at a critical stage in our discussion 
below, and the reader will see there that this property is the main reason 
why the ideals in a ring are so much more important than its subrings. 
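Property (4) can be spot-checked numerically in the ring of integers, taking for I the ideal of all integral multiples of a fixed positive integer m. The following sketch tests every choice of four integers in a small window; it is a finite check under that assumption, not a proof, and the names congruent and check_property_4 are illustrative.

```python
import itertools


def congruent(x, y, m):
    """x is congruent to y modulo the ideal of multiples of m: x - y lies in it."""
    return (x - y) % m == 0


def check_property_4(m, bound=6):
    """Test property (4) for all x1, x2, y1, y2 between -bound and bound."""
    window = range(-bound, bound + 1)
    for x1, x2, y1, y2 in itertools.product(window, repeat=4):
        if congruent(x1, x2, m) and congruent(y1, y2, m):
            sums_agree = congruent(x1 + y1, x2 + y2, m)
            products_agree = congruent(x1 * y1, x2 * y2, m)
            if not (sums_agree and products_agree):
                return False
    return True


if __name__ == "__main__":
    print(check_property_4(3), check_property_4(4))
```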

According to the general theory of Sec. 5, this equivalence relation
has associated with it a partition of R into equivalence sets (called
cosets in this context) which are non-empty and disjoint and whose union
is the full ring R. What is the structure of these cosets? In order to
answer this question, we let x be an element of R. The coset [x]
containing x is by definition the set of all elements y such that y ≡ x;
that is, [x] = {y : y ≡ x}. But


      {y : y ≡ x} = {y : y - x ∈ I}
                  = {y : y - x = i for some i ∈ I}
                  = {y : y = x + i for some i ∈ I}
                  = {x + i : i ∈ I}.


A natural notation for the set last written is x + I, which we understand
to signify the set of all sums of x and elements of I. The structure of the
coset [x] containing x is fully exhibited by the fact that [x] = x + I.
Sometimes it is convenient to denote this coset by [x] and sometimes
by x + I. We recall that the same coset can perfectly well arise from
another element, say x_1, and that [x] = [x_1] means that x ≡ x_1, that is,
that x - x_1 is in I. The elements x and x_1 are called representatives of
the coset which contains them.


Our next step is to construct a new ring, which we denote by R/I and
call the quotient ring of R with respect to I. The elements of the ring
R/I are the distinct cosets of the form [x] (or x + I). All that remains
is to define the manner in which these cosets are to be added and
multiplied and to verify the fact that we do indeed obtain a ring. The
definitions are as follows:


      [x] + [y] = [x + y]
and   [x] · [y] = [xy].


In other words, we add and multiply two cosets [x] and [y] by first adding
and multiplying the representatives x and y, and then by forming the
cosets which contain x + y and xy. It is necessary to make certain that
these are legitimate definitions, that is, that the resulting cosets [x + y]
and [xy] do not depend on the particular representatives x and y chosen
for the cosets [x] and [y]. To this end, we take two other representatives
of the same two cosets, i.e., two other elements x_1 and y_1 of R such that
x_1 ≡ x and y_1 ≡ y. We want to satisfy ourselves that


      [x_1 + y_1] = [x + y]


and [x_1y_1] = [xy], or equivalently, that x_1 + y_1 ≡ x + y and
x_1y_1 ≡ xy. Since this is precisely the content of property (4) above,
we do have valid definitions for our ring operations in R/I. We omit the
detailed verification of the fact that R/I with these operations actually
is a ring, remarking only that the zero element of this ring is
[0] = 0 + I = I and that the negative of a typical element [x] = x + I is
[-x] = (-x) + I. It is easy to see that R/I is commutative if R is, and
that if R has an identity 1 and I is a proper ideal, then R/I has an
identity 1 + I.

We summarize the results of this discussion in the following theorem. 


Theorem B. Let I be an ideal in a ring R, and let the coset of an element x
in R be defined by x + I = {x + i : i ∈ I}. Then the distinct cosets form a
partition of R; and if addition and multiplication are defined by


      (x + I) + (y + I) = (x + y) + I
and   (x + I)(y + I) = xy + I,


then these cosets constitute a ring denoted by R/I and called the quotient
ring of R with respect to I, in which the zero element is 0 + I = I and the
negative of x + I is (-x) + I. Further, if R is commutative, then R/I is
also commutative; and if R has an identity 1 and I is a proper ideal, then
R/I has an identity 1 + I.
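A small finite case makes the construction concrete. The sketch below takes R to be the ring I_12 of integers mod 12 and I = {0, 4, 8}; these choices, the representation of a coset as a frozenset, and all function names are assumptions made for illustration. It forms the cosets x + I, defines their sum and product through representatives, and checks by brute force that the result does not depend on the representatives chosen.

```python
M = 12
R = range(M)                    # the ring I_12 of integers mod 12
I = frozenset({0, 4, 8})        # the multiples of 4, an ideal in I_12


def coset(x):
    """The coset x + I = {x + i : i in I}, computed in I_12."""
    return frozenset((x + i) % M for i in I)


def add(A, B):
    """(x + I) + (y + I) = (x + y) + I, using any representatives x, y."""
    x, y = min(A), min(B)
    return coset((x + y) % M)


def mul(A, B):
    """(x + I)(y + I) = xy + I, using any representatives x, y."""
    x, y = min(A), min(B)
    return coset((x * y) % M)


cosets = {coset(x) for x in R}  # the distinct cosets partition I_12


def well_defined():
    """Check that other representatives x1, y1 give the same sum and product."""
    return all(
        coset((x1 + y1) % M) == coset((x + y) % M)
        and coset((x1 * y1) % M) == coset((x * y) % M)
        for x in R for y in R
        for x1 in coset(x) for y1 in coset(y)
    )


if __name__ == "__main__":
    print("number of cosets:", len(cosets))
    print("operations independent of representatives:", well_defined())
    print("zero element is I itself:", coset(0) == I)
```

Here min simply picks one representative from each coset; the check in well_defined is what justifies using any other.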


We now give a brief account of homomorphisms and of the manner 
in which ideals, quotient rings, and homomorphisms are all related to one 
another. 




Let R and R’ be two rings. A homomorphism of R into R’ is a
mapping f of R into R’ with the following two properties:


      f(x + y) = f(x) + f(y)
and   f(xy) = f(x)f(y).


A homomorphism of one ring into another is thus a mapping of the first
ring into the second which preserves the ring operations. It is easy to
see that f preserves zero in the sense that f(0) = 0;¹ for


      f(0) + f(0) = f(0 + 0) = f(0),


and subtracting f(0) from both sides yields our result. Similarly, f
preserves negatives, for f(-x) = -f(x) follows from


      f(x) + f(-x) = f(x + (-x)) = f(0) = 0.


The image f(R) of R under f is clearly a subring of R’. This subring 
f(R)—which is R’ itself when f is onto—is called a homomorphic image 
of R. If the homomorphism f is one-to-one, then it is called an iso- 
morphism, and the subring f(R) is called an isomorphic image of R. An 
isomorphic image of R can be thought of as a ring which is essentially 
identical with R, for it differs from R only in the matter of notation. 
The properties of R are reflected with complete precision in an isomorphic 
image and with somewhat less precision in a homomorphic image. 

Let f be a homomorphism of R into R’. The kernel K of this
homomorphism is the inverse image in R of the zero ideal in R’:


      K = {x : x ∈ R and f(x) = 0}.


It is easy to see that K is an ideal in R, and also that K is the zero ideal
in R ⇔ f is an isomorphism. We leave these verifications to the reader.
In a sense, the size of the kernel K is a measure of the extent to which
f fails to be an isomorphism.
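These notions can be tried out on a concrete mapping. The sketch below assumes the mapping f from the integers mod 12 to the integers mod 4 which sends x to its remainder on division by 4; this example and the helper names are not from the text. It checks the two defining properties by brute force, computes the kernel, and confirms that the kernel is an ideal.

```python
M, N = 12, 4   # f maps the ring I_12 into the ring I_4


def f(x):
    """Send an element of I_12 to its remainder on division by 4."""
    return x % N


def is_homomorphism():
    """Check f(x + y) = f(x) + f(y) and f(xy) = f(x)f(y) for all x, y in I_12."""
    return all(
        f((x + y) % M) == (f(x) + f(y)) % N
        and f((x * y) % M) == (f(x) * f(y)) % N
        for x in range(M) for y in range(M)
    )


# The kernel: all elements of I_12 which f carries to the zero of I_4.
kernel = {x for x in range(M) if f(x) == 0}


def kernel_is_ideal():
    """Closed under sums and negatives, and under multiplication by any r in I_12."""
    sums = all((a + b) % M in kernel for a in kernel for b in kernel)
    negatives = all((-a) % M in kernel for a in kernel)
    absorbs = all((r * a) % M in kernel for r in range(M) for a in kernel)
    return sums and negatives and absorbs


if __name__ == "__main__":
    print("homomorphism:", is_homomorphism())
    print("kernel:", sorted(kernel))
    print("kernel is an ideal:", kernel_is_ideal())
```

The kernel here is not the zero ideal, which matches the fact that this f is not one-to-one.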

What is the real significance of homomorphisms, homomorphic 
images, and kernels? A full answer to this question would carry us into 
the utmost reaches of the general theory of rings, where we have no 
intention of treading. We will, however, attempt a brief and necessarily
vague partial answer. Suppose that R is a ring whose features are 
unfamiliar, whose structure is unknown. The question confronting us is, 
What is the nature of R? And, as is often the case in mathematics, if 
we can state adequately what this question means, we will have taken 
a long step toward answering it. Suppose now that R’ is a homomorphic 
image of R and that R’ is a well-known ring which is intuitively familiar 


¹ We use the symbol 0 here to designate two different elements: the zero in R 
and the zero in R’. This convention is customary, and it saves more trouble than it 
causes. 
and thoroughly understood. R’ then provides a picture of the structure 
of R. The details of this picture may be blurred and fragmentary, but 
we can usually glean from them a few hints as to the nature of R itself. 
If we have available many homomorphic images of R, it is often possible 
to correlate the hints we get from these many sources in such a way as to 
build up a fully detailed and completely precise picture of the original 
ring R. This is the overall strategy in the structure theory (or 
representation theory) of rings.¹ The relevance of ideals to this strategic 
pattern depends on the following fact: all possible homomorphic images of R 
can be constructed by means of the ideals in R. We next describe how this 
is accomplished. 

Fig. 30. 

Let R be a ring, and let f be a homomorphism of R onto a ring R’. 
Let K be the kernel of f. Since K is an ideal in R, we can form the 
quotient ring R/K. We now observe that R/K is a homomorphic image 
of R under the homomorphism g, called the natural homomorphism, 
defined by 


g(x) = x + K. 


The fact that g is a homomorphism follows directly from the definition 
of the ring operations in R/K: 


g(x + y) = (x + y) + K = (x + K) + (y + K)
         = g(x) + g(y)

and g(xy) = xy + K = (x + K)(y + K)
          = g(x)g(y).


Finally, we show that R/K and R’ are essentially identical by producing 
an isomorphism of R/K onto R’. Let a mapping h be defined on R/K by 
h(x + K) = f(x). We leave it to the reader to verify that h is a well-defined
mapping of R/K onto R’ and is also an isomorphism. Figure 30 
gives a schematic representation of this situation. Since R/K and R’ are 
isomorphic, we can replace R’ in any discussion by its replica R/K. It is 


¹ The procedure described here is loosely similar to a familiar technique from 
three-dimensional analytic geometry, in which the form of a curved surface is studied 
by means of its cross sections. The information obtainable from any given cross 
section is meager, but an intelligent consideration of all the successive cross sections 
can yield a satisfactory mental image of the surface as a whole. 
therefore unnecessary to go beyond the ring R to find all its homomorphic 
images. 
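
The construction of h can be followed step by step in a small finite case. The sketch below is an illustration under stated assumptions, not part of the text's argument: it takes R to be the integers mod 12, R’ the integers mod 4, and f the reduction map x → x mod 4 (a homomorphism because 4 divides 12). All names are hypothetical.

```python
# Sketch: the isomorphism h(x + K) = f(x) of R/K onto R' for R = Z_12, R' = Z_4.
n, m = 12, 4          # hypothetical choice; m must divide n
R = range(n)


def f(x: int) -> int:
    """Homomorphism of Z_12 onto Z_4."""
    return x % m


# Kernel of f: everything sent to the zero of R'.
K = frozenset(x for x in R if f(x) == 0)


def coset(x: int) -> frozenset:
    """The coset x + K, stored as a set of elements of R."""
    return frozenset((x + k) % n for k in K)


cosets = {coset(x) for x in R}

# h is well defined exactly when f takes a single value on each coset.
h = {}
for c in cosets:
    values = {f(x) for x in c}
    assert len(values) == 1, "f is not constant on a coset"
    h[c] = values.pop()

# h is one-to-one and onto, and it preserves both ring operations.
h_is_bijection = sorted(h.values()) == list(range(m))
h_preserves_add = all(
    h[coset((x + y) % n)] == (h[coset(x)] + h[coset(y)]) % m for x in R for y in R
)
h_preserves_mul = all(
    h[coset((x * y) % n)] == (h[coset(x)] * h[coset(y)]) % m for x in R for y in R
)
```

Here K = {0, 4, 8}, there are four cosets, and h matches them with the four elements of R’.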

If I is an ideal in a ring R, then its properties relative to all of R are 
reflected in corresponding properties of the quotient ring R/I, and many 
aspects of the study of R depend on the presence in it of ideals whose 
corresponding quotient rings are simple and familiar. 

In order to illustrate this fundamental principle, we introduce the 
following concept. An ideal I in a ring R is said to be a maximal ideal if 
it is a proper ideal which is not properly contained in any other proper 
ideal. Our next theorem is an immediate consequence of this concept and 
Theorem A. 


Theorem C. If R is a commutative ring with identity, then an ideal I in 
R is maximal ⇔ R/I is a field. 

PROOF. We first observe that if I is maximal, then R/I is a commutative 
ring with identity in which there are no non-trivial ideals, so by Theorem 
A it follows that R/I is a field. We now assume that I is not maximal, 
and we show that R/I is not a field. There are two possibilities: (a) 
that I = R, and (b) that there exists an ideal J such that I ⊂ J ⊂ R. 
In case (a), R/I has no non-zero elements, so it cannot be a field. In 
case (b), R/I is a commutative ring with identity which contains the 
non-trivial ideal J/I, so again it cannot be a field. 


The commutative rings we study in later chapters have a great 
many distinct maximal ideals, and this theorem will serve us well in our 
program of analyzing the structure of these rings. 
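
Theorem C can be seen at work in the one family of quotient rings that is easy to enumerate, the rings of integers mod m (Problem 3 below identifies these with quotient rings of the integers). The following is a minimal sketch under that assumption: it tests by brute force whether every non-zero element of the integers mod m has a multiplicative inverse, and compares the answer with the primality of m. The function names and the range tested are hypothetical.

```python
# Sketch: the integers mod m form a field exactly when m is prime (small m only).

def is_field_mod(m: int) -> bool:
    """True if every non-zero residue mod m has a multiplicative inverse."""
    if m < 2:
        return False
    return all(
        any((a * b) % m == 1 for b in range(1, m))
        for a in range(1, m)
    )


def is_prime(m: int) -> bool:
    """Trial division; adequate for the small values used here."""
    if m < 2:
        return False
    return all(m % d != 0 for d in range(2, int(m ** 0.5) + 1))


# Compare the two properties for a range of small moduli.
agreement = all(is_field_mod(m) == is_prime(m) for m in range(2, 40))
```

The check covers only finitely many moduli, so it illustrates the statement rather than proving it.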


Problems 


1. Let R be a ring with identity which is not necessarily commutative. 
In view of Theorem A, it is natural to conjecture that R is a division 
ring ⇔ it has no non-trivial ideals. Try to prove this conjecture by 
the method used in the proof of Theorem A. At what precise point 
does this attempted proof break down? How much of the con- 
jecture can you prove? 

2. Let R be the ring of all real functions defined on the closed unit 
interval [0,1]. If X is a subset of [0,1], show that the ideal I(X) in 
R defined by I(X) = {f : f(x) = 0 for every x ∈ X} is maximal ⇔ X 
consists of a single point. 

3. Let I be the ring of integers and m a positive integer. It is easy to 
see that if x is any integer, then x can be represented uniquely in the 
form x = qm + r, where q and r are integers and r is in the set 
{0, 1, ..., m − 1}. Use this fact to show that a non-zero ideal 
in I is necessarily of the form m̄ = {..., −2m, −m, 0, m, 2m, ...} 
for some positive integer m. Show that the mapping f defined on I 
by f(x) = r is a homomorphism of I onto the ring I_m of integers mod 
m. Show that the kernel of this homomorphism is the ideal m̄, so 
that the quotient ring I/m̄ is isomorphic to I_m, and conclude from 
this that m̄ is maximal ⇔ m is a prime number. 


42. LINEAR SPACES 


We introduced linear spaces in Sec. 14, and we also mentioned a few 
of their simpler properties. Our present purpose is to develop the 
theory of these systems in somewhat greater detail. 

We begin by restating the definition in terms of concepts now availa- 
ble to us. The reader will recall that by the scalars we mean either the 
system of real numbers or the system of complex numbers. A linear 
space (or vector space) is an additive Abelian group L (whose elements 
are called vectors) with the property that any scalar α and any vector x 
can be combined by an operation called scalar multiplication to yield a 
vector αx in such a manner that 

(1) α(x + y) = αx + αy; 

(2) (α + β)x = αx + βx; 

(3) (αβ)x = α(βx); 

(4) 1 · x = x. 

A linear space is thus an additive Abelian group whose elements can be 
multiplied by numbers in a reasonable way, but not necessarily by one 
another (as in the case of rings). The two primary operations in a 
linear space—addition and scalar multiplication—are called the linear 
operations, and its zero element is usually referred to as the origin. 

A linear space is called a real linear space or a complex linear space 
according as the scalars are the real numbers or the complex numbers. 
The advantage of calling the numerical coefficients scalars is that we 
avoid committing ourselves to either the real case or the complex case 
and are free to develop the theory for both simultaneously.¹ In later 
chapters we shall be concerned exclusively with complex linear spaces, 
but for the present we prefer to leave the door open. 

Before proceeding to the general theory of linear spaces, we list a 
few examples. 


¹ In some approaches to the theory of linear spaces, the system of scalars is 
allowed to be an arbitrary field. This degree of generality is unnecessary for our 
purposes; and since unnecessary generality is undesirable, we limit ourselves accord- 
ingly. Even further, the system of scalars can be taken to be an arbitrary ring. In 
this case, one speaks of a module instead of a linear space. Modules are of great 
importance in the structure theory of rings. 
Example 1. The set R of all real numbers, with ordinary addition and 
multiplication taken as the linear operations, is a real linear space. 


Example 2. The set R^n of all n-tuples of real numbers is a real linear 
space under the following coordinatewise linear operations: if 


x = (x_1, x_2, ..., x_n) and y = (y_1, y_2, ..., y_n),
then x + y = (x_1 + y_1, x_2 + y_2, ..., x_n + y_n)
and αx = (αx_1, αx_2, ..., αx_n).


This reduces to Example 1 when n = 1. 
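
The coordinatewise operations of Example 2 are easy to write down explicitly. The sketch below is only an illustration: it represents n-tuples as Python tuples of exact rational numbers and spot-checks properties (1) to (4) of the definition for one arbitrarily chosen pair of vectors and pair of scalars. The function names and sample values are hypothetical.

```python
# Sketch: coordinatewise linear operations on n-tuples, with a spot check
# of the four scalar-multiplication properties in the definition.
from fractions import Fraction as F


def add(x, y):
    """Coordinatewise sum of two n-tuples of equal length."""
    return tuple(a + b for a, b in zip(x, y))


def scale(alpha, x):
    """Coordinatewise product of a scalar with an n-tuple."""
    return tuple(alpha * a for a in x)


# Arbitrary sample data (exact rationals avoid rounding questions).
x = (F(1), F(-2), F(3))
y = (F(1, 2), F(0), F(5))
alpha, beta = F(2, 3), F(-7)

axioms_hold = [
    scale(alpha, add(x, y)) == add(scale(alpha, x), scale(alpha, y)),  # (1)
    scale(alpha + beta, x) == add(scale(alpha, x), scale(beta, x)),    # (2)
    scale(alpha * beta, x) == scale(alpha, scale(beta, x)),            # (3)
    scale(F(1), x) == x,                                               # (4)
]
```

Each property holds here for the same reason it holds in general: it is true in every coordinate separately.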


Example 3. The set C(X,R) of all bounded continuous real functions 
defined on a topological space X is a real linear space under the following 
pointwise linear operations: if f and g are functions in C(X,R), then 
f + g and αf are defined by 


(f + g)(x) = f(x) + g(x)
and (αf)(x) = αf(x).


Example 4. The set C of all complex numbers is a complex linear space 
under ordinary addition and multiplication. 


Example 5. The set C^n of all n-tuples of complex numbers is a complex 
linear space with respect to the coordinatewise linear operations defined 
in Example 2. This reduces to Example 4 when n = 1. 


Example 6. The set C(X,C) of all bounded continuous complex func- 
tions defined on a topological space X is a complex linear space with 
respect to the pointwise linear operations defined in Example 3. 


Example 7. Let P be the set of all polynomials, with real coefficients, 
defined on the closed unit interval [0,1]. We specifically include all 
non-zero constant polynomials (which have degree 0) and the polynomial 
which is identically zero (this has no degree at all). If the linear opera- 
tions are taken to be the usual addition of two polynomials and the 
multiplication of a polynomial by a real number, then P is a real linear 
space. 


Example 8. For a given positive integer n, let P_n be the subset of P 
consisting of the polynomial which is identically zero and all polynomials 
of degree less than n. P_n is a real linear space with respect to the linear 
operations defined in P. 


Example 9. A linear space may consist solely of the vector 0, with 
scalar multiplication defined by α · 0 = 0 for all α. We refer to this as 
the zero space, and we always denote it by {0}. 
These examples are typical of the spaces which will concern us and 
give ample scope for the illustration of all the important phenomena. 
There are a number of other linear spaces of great interest, and we men- 
tion some of these from time to time in later chapters. For the present, 
however, the above list will suffice. 

We saw in Sec. 14 that in any linear space we have α · 0 = 0, 
0 · x = 0, and (−1)x = −x. It is also easy to show that αx = 0 ⇒ α = 0 
or x = 0; for if α ≠ 0, then multiplying both sides of αx = 0 by α^(−1) 
yields α^(−1)(αx) = α^(−1) · 0, (α^(−1)α)x = 0, 1 · x = 0, and finally, x = 0. 

We now turn to the general theory of an arbitrary linear space L. 

A non-empty subset M of L is called a subspace (or a linear subspace) 
of L if M is a linear space in its own right with respect to the linear opera- 
tions defined in L. This is clearly equivalent to the condition that M 
contain all sums, negatives, and scalar multiples of its elements; and 
since −x = (−1)x, this in turn is equivalent to the condition that M be 
closed under addition and scalar multiplication. If the subspace M is a 
proper subset of L, then it is called a proper subspace of L. The zero 
space {0} and the full space L itself are always subspaces of L. Among 
our examples, P_n is a subspace of P for each positive integer n, and 
P is a subspace of C[0,1]. Also, the following are easily seen to be 
subspaces of R^3: 


M_1 = {(x_1, 0, 0)}, M_2 = {(0, x_2, 0)}, M_3 = {(0, 0, x_3)},
and M_4 = {(0, x_2, x_3)}, M_5 = {(x_1, 0, x_3)}, M_6 = {(x_1, x_2, 0)}.


The subspaces M_1, M_2, M_3 are usually called the coordinate axes in solid 
analytic geometry, and M_4, M_5, M_6 are called the coordinate planes. 
The most general non-zero proper subspace of R^3 is a line or a plane 
through the origin. 

If M is a subspace of L, then (just as in the case of an ideal in a 
ring) we can use M to define an equivalence relation in L as follows: 
x ≡ y (mod M) means that x − y is in M. The discussion leading up to 
Theorem 41-B can be repeated without essential change (but with con- 
siderable simplification) to yield the concept of the quotient space L/M of 
L with respect to M. We give a formal statement of the basic facts in 
the following theorem. 


Theorem A. Let M be a subspace of a linear space L, and let the coset of 
an element x in L be defined by x + M = {x + m : m ∈ M}. Then the 
distinct cosets form a partition of L; and if addition and scalar multiplica- 
tion are defined by 


(x + M) + (y + M) = (x + y) + M
and α(x + M) = αx + M,

then these cosets constitute a linear space denoted by L/M and called the 
quotient space of L with respect to M. The origin in L/M is the coset 
0 + M = M, and the negative of x + M is (−x) + M. 


The proof of this theorem is routine, and we leave the details to the 
reader. 

It is worth remarking that the concept of a quotient space has a 
simple geometric interpretation. To bring this out most clearly, we let 
L be the linear space R^2 and M the subspace indicated in Fig. 31. If we 


Fig. 31. Addition in a quotient space. 


think of the vectors in L as the heads of arrows whose tails are at the 
origin, then the non-zero proper subspace M is a straight line through the 
origin, a typical coset x + M is a line parallel to M, and L/M consists of 
all lines parallel to M. We add two cosets x + M and y + M by adding 
x and y and by forming the line (x + y) + M through the head of x + y 
and parallel to M. Scalar multiplication is carried out similarly. 
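
The geometric picture can be checked numerically in one concrete case. The sketch below assumes, purely for illustration, that L is the plane of pairs of real numbers and that M is the line of all pairs (t, t); the figure's own M is not specified here, and every name is hypothetical. Each line parallel to this M is determined by the single number x_1 − x_2, which is constant along the line, so that number serves as a label for the coset x + M.

```python
# Sketch: cosets of the line M = {(t, t)} in the plane, labelled by x_1 - x_2.

def coset_label(x):
    """x_1 - x_2 is constant on each line parallel to M, so it names x + M."""
    return x[0] - x[1]


def same_coset(x, y):
    """x + M = y + M exactly when x - y lies in M."""
    return coset_label(x) == coset_label(y)


def add(x, y):
    return (x[0] + y[0], x[1] + y[1])


def scale(alpha, x):
    return (alpha * x[0], alpha * x[1])


# Two representatives of the same coset, and likewise for a second coset.
x, x_alt = (3, 1), (5, 3)      # x_alt = x + (2, 2)
y, y_alt = (0, 4), (-1, 3)     # y_alt = y + (-1, -1)

# The sum coset and the scalar multiple do not depend on the representatives.
sum_well_defined = same_coset(add(x, y), add(x_alt, y_alt))
scale_well_defined = same_coset(scale(7, x), scale(7, x_alt))
```

The labels add and scale exactly as the cosets do, which is one way of seeing that the quotient space in this case is a copy of the real line.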

The subspaces of our linear space L can be characterized conveniently 
as follows. If {x_1, x_2, ..., x_n} is a finite non-empty set of vectors in 
L, then the vector 

x = α_1x_1 + α_2x_2 + ··· + α_nx_n

is called a linear combination of x_1, x_2, ..., x_n. It is evident that a 
subspace of L is simply a non-empty subset of L which is closed under 
the formation of linear combinations. If S is an arbitrary non-empty 
subset of L, then the set of all linear combinations of vectors in S is 
clearly a subspace of L; we denote this subspace by [S], and we call it 
the subspace spanned by S. Since [S] is a subspace which contains S and 
is contained in every subspace which contains S, we may think of [S] as 
the smallest subspace which contains S. If M is a subspace of L, then 
a non-empty subset S of M is said to span M if [S] = M. 

Suppose now that M and N are subspaces of L, and consider the set 
M + N of all sums of the form x + y, where x ∈ M and y ∈ N. Since 
M and N are subspaces, it is easy to see that M + N is the subspace 
spanned by all vectors in M and N together, i.e., that M + N = [M ∪ N]. 
If it happens that M + N = L, then we say that L is the sum of the 
subspaces M and N. This means that each vector in L is expressible as 
the sum of a vector in M and a vector in N. The case in which even 
more is true (namely, that each vector z in L is expressible uniquely in 
the form z = x + y, with x ∈ M and y ∈ N) will be of particular 
importance for us. In this case we say that L is the direct sum of the subspaces 
M and N, and we symbolize this statement by writing L = M ⊕ N. 


Theorem B. Let a linear space L be the sum of two subspaces M and N, 
so that L = M + N. Then L = M ⊕ N ⇔ M ∩ N = {0}. 

PROOF. We begin by assuming that L = M ⊕ N, and we deduce a 
contradiction from the further assumption that there is a non-zero 
vector z in M ∩ N. It suffices to observe that z is expressible in two 
different ways as the sum of a vector x in M and a vector y in N, for 
z = z + 0 (here x = z and y = 0) and z = 0 + z (here x = 0 and y = z). 
This contradicts the uniqueness required by the assumption that L = 
M ⊕ N. 

We now assume that M ∩ N = {0}, and we show that it follows 
from this that L = M + N can be strengthened to L = M ⊕ N. 
Since L = M + N, each z in L can be written in the form z = x + y with 
x ∈ M and y ∈ N. We wish to show that this decomposition is unique. 
If we have two such decompositions of z, so that z = x_1 + y_1 = x_2 + y_2, 
then x_1 − x_2 = y_2 − y_1. The left side of this is in M, the right side is 
in N, and they are equal; it therefore follows from M ∩ N = {0} that 
both sides are 0, that x_1 = x_2 and y_1 = y_2, and that the decomposition 
of z is unique. 


The condition in this theorem, that the subspaces M and N have 
only the origin in common, is often expressed by saying that M and N 
are disjoint. There is fortunately little danger of confusing this with the 
set-theoretical notion of disjointness, for a subspace of a linear space 
always contains the vector 0, so the intersection of any two must also 
contain this vector and they can never be disjoint in the set-theoretical 
sense. 

The concept of a direct sum can easily be broadened to allow for three 
or more subspaces. If M_1, M_2, ..., M_n (n > 2) are subspaces of L, 
then the statement that L is the direct sum of the M_i’s, written 

L = M_1 ⊕ M_2 ⊕ ··· ⊕ M_n,

means that each vector z in L can be represented uniquely in the form 
z = x_1 + x_2 + ··· + x_n, where x_i ∈ M_i for every i. The reader will 
observe that R^3 can be represented in various ways as direct sums of the 
coordinate axes and coordinate planes mentioned above: 


R^3 = M_1 ⊕ M_2 ⊕ M_3 = M_1 ⊕ M_4 = M_2 ⊕ M_5 = M_3 ⊕ M_6.


We shall often have occasion in the following chapters to study problems 
which are intimately concerned with the representation of a linear space 
as the direct sum of certain of its subspaces. 
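
The difference between a sum and a direct sum shows up clearly in the space of triples of real numbers. The following is a small sketch with hypothetical names. It uses the first coordinate axis (triples of the form (x_1, 0, 0)) together with the coordinate plane x_1 = 0, whose sum is direct, and then the two coordinate planes x_1 = 0 and x_2 = 0, whose sum is the whole space but is not direct because the planes share the third coordinate axis.

```python
# Sketch: unique versus non-unique decompositions of a triple.

def add(x, y):
    return tuple(a + b for a, b in zip(x, y))


def split_axis_plane(z):
    """The only way to write z as (x_1, 0, 0) + (0, x_2, x_3)."""
    z1, z2, z3 = z
    return (z1, 0, 0), (0, z2, z3)


z = (2, -1, 5)   # arbitrary sample vector

# Direct sum of the axis and the plane x_1 = 0: the pieces are forced.
x, y = split_axis_plane(z)

# Sum of the planes x_1 = 0 and x_2 = 0: the third coordinate can be
# assigned to either summand, so two different decompositions exist.
first = ((0, z[1], z[2]), (z[0], 0, 0))
second = ((0, z[1], 0), (z[0], 0, z[2]))
both_give_z = add(*first) == z and add(*second) == z
decompositions_differ = first != second
```

The second pair of subspaces fails the condition of Theorem B, since the vector (0, 0, 1) lies in both planes.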


Problems 


1. Each of the following conditions determines a subset of the real 
linear space R^3 of all triples x = (x_1, x_2, x_3) of real numbers: (a) x_1 is 
an integer; (b) x_1 = 0 or x_2 = 0; (c) x_1 + 2x_2 = 0; (d) x_1 + 2x_2 = 1. 
Which of these subsets are subspaces of R^3? 

2. Each of the following conditions determines a subset of the real 
linear space C[−1,1] of all bounded continuous real functions y = f(x) 
defined on [−1,1]: (a) f is differentiable; (b) f is a polynomial of 
degree 3; (c) f is an even function, in the sense that f(−x) = f(x) for 
all x; (d) f is an odd function, in the sense that f(−x) = −f(x) for 
all x; (e) f(0) = 0; (f) f(0) = 1; (g) f(x) = 0 for all x. Which of 
these subsets are subspaces of C[−1,1]? 

3. In the preceding problem, show that C[−1,1] is the direct sum of the 
subspaces defined by conditions (c) and (d). (Hint: observe that 
f(x) = [f(x) + f(−x)]/2 + [f(x) − f(−x)]/2.) 

4. Let a linear space L be the sum of certain subspaces M_1, M_2, ..., 
M_n (n > 2), and show that L is the direct sum of these subspaces 
⇔ each M_i is disjoint from the subspace spanned by all the others. 
The latter condition clearly implies that each M_i is disjoint from each 
of the others. Show that the converse of this statement is false by 
exhibiting three subspaces M_1, M_2, M_3 of R^2 such that 


M_1 ∩ M_2 = M_1 ∩ M_3 = M_2 ∩ M_3 = {0}
and M_1 ∩ (M_2 + M_3) ≠ {0}.


43. THE DIMENSION OF A LINEAR SPACE 


Let L be a linear space, and let S = {x_1, x_2, ..., x_n} be a finite 
non-empty set of vectors in L. S is said to be linearly dependent if there 
exist scalars α_1, α_2, ..., α_n, not all of which are 0, such that 


α_1x_1 + α_2x_2 + ··· + α_nx_n = 0.          (1)


If S is not linearly dependent, then it is called linearly independent; and 
this clearly means that if Eq. (1) holds for certain scalar coefficients 
α_1, α_2, ..., α_n, then all these scalars are necessarily 0. In other 
words, S is linearly independent if the trivial linear combination of its 
vectors (with all scalar coefficients equal to 0) is the only one which equals 
0, and it is linearly dependent if some non-trivial linear combination of its 
vectors equals 0. In either case, as we know, the vectors in the subspace 
[S] spanned by S are precisely the linear combinations 


x = α_1x_1 + α_2x_2 + ··· + α_nx_n          (2)


of the x_i’s. The significance of the linear independence of S rests on the 
fact that if S is linearly independent, then each vector x in [S] is uniquely 
expressible in this form; for if we also have 


x = β_1x_1 + β_2x_2 + ··· + β_nx_n,          (3)

then subtracting (3) from (2) yields 

(α_1 − β_1)x_1 + (α_2 − β_2)x_2 + ··· + (α_n − β_n)x_n = 0,


from which, by the linear independence of S, we obtain α_i − β_i = 0 or 
α_i = β_i for every i. Further, the linear independence of S not only 
implies this uniqueness, but is also implied by it, for the statement that 
the vector 0 in [S] is uniquely expressible in the form 


0 = 0 · x_1 + 0 · x_2 + ··· + 0 · x_n


is exactly what is meant by the linear independence of S. 
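
For finite sets of n-tuples, linear independence can be decided mechanically. The sketch below swaps in a standard computational test that the text itself does not use: a finite set of n-tuples is linearly independent exactly when Gaussian elimination leaves as many non-zero rows as there are vectors. The function names are hypothetical, and exact rational arithmetic is assumed so that no rounding is involved.

```python
# Sketch: testing linear independence of finitely many n-tuples by elimination.
from fractions import Fraction


def rank(vectors):
    """Number of non-zero rows left after exact Gaussian elimination."""
    rows = [[Fraction(a) for a in v] for v in vectors]
    r = 0  # number of pivot rows found so far
    for c in range(len(rows[0])):
        # Look for a row at or below position r with a non-zero entry in column c.
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        # Clear column c in every row below the pivot row.
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def is_linearly_independent(vectors):
    """True if only the trivial linear combination of the vectors equals 0."""
    return rank(vectors) == len(vectors)
```

For instance, `is_linearly_independent([(1, 2), (2, 4)])` is false, since the second pair is twice the first, while the pairs (1, 1) and (0, −1) pass the test.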

It is necessary to extend these concepts to cover the case of an 
arbitrary non-empty set of vectors in L. We shall say that such a set is 
linearly independent if every finite non-empty subset is linearly 
independent in the sense of the above paragraph; otherwise, it is said to be 
linearly dependent. Just as in the finite case, an arbitrary non-empty 
subset S of L is linearly independent ⇔ each vector in the subspace [S] 
spanned by S is uniquely expressible as a linear combination of the vectors 
in S. We are particularly interested in linearly independent sets which 
span the whole space L. Such a set is called a basis for L. It is 
important to observe that if S is a linearly independent subset of L, then S is a 
basis for L ⇔ it is maximal with respect to being linearly independent, in 
the sense that every subset of L which properly contains S is linearly 
dependent. 

Our first theorem assures us that if a linearly independent set is not 
already a basis, then it can always be enlarged to form a basis. 


Theorem A. If S is a linearly independent set of vectors in a linear space 
L, then there exists a basis B for L such that S ⊆ B. 


PROOF. Consider the class P of all linearly independent subsets of L 
which contain S. P is clearly a partially ordered set with respect to set 
inclusion. It suffices to show that P contains a maximal set B, for such a 
maximal set will automatically be a basis for L such that S ⊆ B. By 
Zorn’s lemma, it suffices to show that every chain in P has an upper 
bound in P. But this is evident from the fact that the union of all the 
sets in any chain of linearly independent sets which contain S is itself a 
linearly independent set which contains S. 


A linearly independent set is non-empty by definition, and it clearly 
cannot contain the vector 0. We see from this that if our linear space 
L is the zero space {0}, then no subset of L is linearly independent and L 
has no basis. On the other hand, if L ≠ {0} and x is a non-zero vector 
in L, then the single-element set {x} is linearly independent and Theorem 
A guarantees that L has a basis which contains {x}. This proves 


Theorem B. Every non-zero linear space has a basis. 


Fig. 32. Two bases {e_1, e_2} and {f_1, f_2} for R^2. 

Since any single-element set consisting of a non-zero vector can be 
enlarged to form a basis, it is evident that any given non-zero linear space 
has a great many different bases. In R^2, for instance, the vectors 
e_1 = (1,0) and e_2 = (0,1) form a basis, as do f_1 = (1,1) and f_2 = (0,−1) 
(see Fig. 32). If we think of a vector as an arrow whose tail is the origin, 
it is fairly clear on geometrical grounds that in this space any two non- 
zero vectors form a basis if they are not collinear. We bring order out of 
this apparent chaos by proving in several stages that any two bases in a 
non-zero linear space have the same number of elements. Our next 
theorem is the first step in this process. 


Theorem C. Let S = {x_1, x_2, ..., x_n} be a finite non-empty set of 
vectors in a linear space L. If n = 1, then S is linearly dependent ⇔ x_1 = 0. 
If n > 1 and x_1 ≠ 0, then S is linearly dependent ⇔ some one of the vectors 
x_2, ..., x_n is a linear combination of the vectors in S which precede it. 

PROOF. The first statement is obvious, so we assume that n > 1 and 
that x_1 ≠ 0. It is easy to see that if one of the vectors x_2, ..., x_n is a 
linear combination of the preceding ones, then the equation expressing 
this fact can be rewritten in the form of Eq. (1) in such a way that the 
coefficient of the vector in question is 1, so S is linearly dependent. We 
now assume that S is linearly dependent, so that Eq. (1) holds with at 
least one non-zero coefficient. If α_i is the last non-zero coefficient, then 
i > 1 (since x_1 ≠ 0) and Eq. (1) can be rewritten in such a way that x_i is 
exhibited as a linear combination of x_1, ..., x_{i−1} (with coefficients 
−α_1/α_i, ..., −α_{i−1}/α_i). 


We next prove the following restricted form of our main theorem. 


Theorem D. Let L be a non-zero linear space. If L has a finite basis 
B_1 = {e_i} = {e_1, e_2, ..., e_n} with n elements, then any other basis 
B_2 = {f_j} is also finite and also has n elements. 

PROOF. To show that B_2 is finite, we assume that it is not, and we deduce 
a contradiction from this assumption. We first observe that each e_i is a 
linear combination of certain f_j’s and that all the f_j’s which occur in this 
way constitute a finite subset S of B_2. Since B_2 is assumed to be infinite, 
there exists a vector f_{j_0} in B_2 which is not in S. But f_{j_0} is a linear 
combination of the e_i’s, and therefore of the vectors in S. This shows that 
S ∪ {f_{j_0}} is a linearly dependent subset of B_2, which contradicts the fact 
that B_2 is a basis. 

Since the basis B_2 is finite, it can be written in the form 


B_2 = {f_j} = {f_1, f_2, ..., f_m}


for some positive integer m. We must now show that m and n are equal, 
and this we do as follows. Since the e_i’s span L, f_1 is a linear combination 
of the e_i’s, and the set S_1 = {f_1, e_1, e_2, ..., e_n} is linearly dependent. 
We know by Theorem C that one of the e_i’s, say e_{i_0}, is a linear combination 
of the vectors in S_1 which precede it. If we delete e_{i_0} from S_1, then the 
remaining set S_2 = {f_1, e_1, ..., e_{i_0−1}, e_{i_0+1}, ..., e_n} still spans L. 
Just as before, f_2 is a linear combination of the vectors in S_2, so the set 
S_3 = {f_1, f_2, e_1, ..., e_{i_0−1}, e_{i_0+1}, ..., e_n} is linearly dependent. 
Another application of Theorem C shows that some vector in S_3 is a 
linear combination of the preceding ones; and since the f_j’s are linearly 
independent, this vector must be one of the e_i’s. If we delete this vector, 
then the remaining set again spans L. If we continue in this way, it is 
clear that we cannot run out of e_i’s before the f_j’s are exhausted; for if we 
do, then the remaining f_j’s are linear combinations of those already used, 
which contradicts the linear independence of the f_j’s. This shows that 
n is not less than m, or equivalently, that m ≤ n. If we reverse the 
roles of the e_i’s and f_j’s, then precisely the same reasoning yields n ≤ m, 
from which we conclude that m = n. 


We are now in a position to prove our main theorem in its full 
generality. 


Theorem E. Let L be a non-zero linear space. If B_1 = {e_i} and B_2 = {f_j} 
are any two bases for L, then B_1 and B_2 have the same number of elements 
(that is, the same cardinal number). 

PROOF. If either B_1 or B_2 is finite, then by the preceding theorem the 
other is also finite, and they have the same number of elements. We may 
therefore confine our attention to the case in which both are infinite. 

Since B_2 is a basis, each e_i can be expressed uniquely as a linear 
combination (with non-zero coefficients) of certain f_j’s: 


e_i = α_1f_{j_1} + α_2f_{j_2} + ··· + α_nf_{j_n}.


Further, every f_j occurs in at least one such expression, for if a certain one, 
say f_{j_0}, does not, then since B_1 is a basis, f_{j_0} is a linear combination of 
certain e_i’s, and therefore of certain f_j’s ≠ f_{j_0}, which contradicts the fact 
that the f_j’s are linearly independent. This process associates with 
each e_i a finite non-empty set F_{e_i} of f_j’s, and in such a way that B_2 is the 
union of the sets F_{e_i} for e_i ∈ B_1. Let n_1 and n_2 be the cardinal numbers of 
B_1 and B_2, and let n be the cardinal number of the indicated union. It follows 
from the above set equality that n_2 = n, and Problem 8-10 shows that n ≤ n_1, 
so n_2 ≤ n_1. If we reverse the roles of the e_i’s and f_j’s, then in the same 
manner we obtain n_1 ≤ n_2, from which we conclude that n_1 = n_2. 


These theorems enable us to define the dimension of an arbitrary 
linear space L. If L = {0}, then it is said to be 0-dimensional, or to 
have dimension 0; and if L ≠ {0}, then its dimension is the number of 
elements in any basis. A linear space is called finite-dimensional if its 
dimension is 0 or a positive integer, and infinite-dimensional otherwise. 
We can now justify the usual practice of calling R^n and C^n n-dimensional 
spaces by exhibiting the following n vectors as a basis for both spaces: 


e_1 = (1, 0, 0, ..., 0),
e_2 = (0, 1, 0, ..., 0),
e_3 = (0, 0, 1, ..., 0),
. . . . . . . . . . . .
e_n = (0, 0, 0, ..., 1).

It is easy to see that P_n is also n-dimensional, for the polynomials 1, x, 
x^2, ..., x^(n−1) constitute a basis for this space. Similarly, the set 
{1, x, x^2, ..., x^n, ...} is a basis for P, so this space is 
infinite-dimensional. More precisely, the dimension of P is ℵ_0. 

The existence of a basis for an arbitrary non-zero linear space, and 
the fact that the number of elements in a basis is a constant determined 
only by the space, can also be used to give a simple but complete structure 
theory for these spaces. We proceed as follows. 

Let L and L’ be linear spaces with the same system of scalars. An 
isomorphism of L onto L’ is a one-to-one mapping f of L onto L’ such 
that f(x + y) = f(x) + f(y) and f(αx) = αf(x); and if there exists such 
an isomorphism, then L is said to be isomorphic to L’. To say that one 
linear space is isomorphic to another is to say, in effect, that they are 
abstractly identical with respect to their structure as linear spaces. 

Now let L be a non-zero finite-dimensional linear space of dimension 
n, and let B = {e_1, e_2, ..., e_n} be a basis for L whose elements are 
written in a definite order as indicated by the subscripts. Each vector 
x in L is uniquely expressible in the form 


x = α_1e_1 + α_2e_2 + ··· + α_ne_n,

so the n-tuple (α_1, α_2, ..., α_n) of scalars is uniquely determined by x. 
If we define a mapping f by f(x) = (α_1, α_2, ..., α_n), then it is easy to 
see that f is an isomorphism of L onto R^n or C^n according as L was real 
or complex to begin with. It should be recognized that the isomorphism 
f is by no means unique, for if some other basis is chosen for L, or if the 
order of the elements in the basis B is altered, then the resulting 
isomorphism of L onto R^n or C^n will clearly be different from f. We 
summarize these remarks in 


Theorem F. Let L be a non-zero finite-dimensional linear space of 
dimension n. If L is real, then it is isomorphic to R^n; and if it is complex, then 
it is isomorphic to C^n. 
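
The coordinate mapping behind Theorem F, and its dependence on the chosen basis, can be made concrete in the plane. The following is a minimal sketch, assuming the basis consisting of (1, 1) and (0, −1) mentioned earlier in this section and integer sample vectors; the function name is hypothetical, and the 2 × 2 system is solved by Cramer's rule, which is a convenience of the sketch rather than the text's method.

```python
# Sketch: coordinates of a vector of the plane relative to an ordered basis.
from fractions import Fraction


def coordinates(x, b1, b2):
    """Scalars (a1, a2) with x = a1*b1 + a2*b2, for integer pairs x, b1, b2."""
    det = b1[0] * b2[1] - b1[1] * b2[0]
    if det == 0:
        raise ValueError("b1 and b2 are collinear, so they do not form a basis")
    a1 = Fraction(x[0] * b2[1] - x[1] * b2[0], det)
    a2 = Fraction(b1[0] * x[1] - b1[1] * x[0], det)
    return (a1, a2)


f1, f2 = (1, 1), (0, -1)
x, y = (3, 5), (-2, 4)

cx = coordinates(x, f1, f2)
cy = coordinates(y, f1, f2)
c_sum = coordinates((x[0] + y[0], x[1] + y[1]), f1, f2)

# The mapping preserves addition ...
additive = c_sum == (cx[0] + cy[0], cx[1] + cy[1])
# ... but reordering the basis produces a different isomorphism.
reordered = coordinates(x, f2, f1)
```

With this data the vector (3, 5) has coordinates (3, −2), since 3 · (1, 1) − 2 · (0, −1) = (3, 5), and listing the basis in the opposite order gives (−2, 3) instead.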


This theorem can easily be extended to the case of an arbitrary non- 
zero linear space. We begin by describing the concrete linear spaces 
which will replace R^n and C^n in our generalized form of Theorem F. 
Let X be an arbitrary non-empty set, and denote by L(X) the set of all 
scalar-valued functions defined on X which vanish outside finite sets. 
Addition and scalar multiplication for such functions are understood to 
be defined pointwise, and L(X) is obviously a non-zero linear space which 
is real or complex according as the functions considered are real or com- 
plex. Our purpose is to show that these spaces are universal models for 
non-zero linear spaces, in the sense that an arbitrary non-zero linear 
space L is isomorphic to some L(X). We start by choosing a basis 
B = {e_i} for L, and we let B be the set X. We next establish an 
isomorphism of L onto L(B) by making correspond to each vector x in L a 
scalar-valued function f_x defined on B. If x = 0, then f_x is defined by 
f_x(e_i) = 0 for every e_i in B. If x ≠ 0, it is uniquely expressible in the 
form 


x = α_1e_{i_1} + α_2e_{i_2} + ··· + α_ne_{i_n}


with non-zero coefficients; and f_x is defined by f_x(e_i) = 0 outside the set 
{e_{i_1}, e_{i_2}, ..., e_{i_n}} and by f_x(e_{i_j}) = α_j inside this set. It is trivial to 
verify that the mapping we have described is an isomorphism of L onto 
L(B). This discussion yields 
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Theorem G. Let L be a non-zero linear space. If Bis a basis for L, then 
L is isomorphic to the linear space L(B) of all scalar-valued functions defined 
on B which vanish outside finite sets. 
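
A function which vanishes outside a finite set can be stored by recording only its non-zero values, and the pointwise operations of L(X) then become operations on these finite records. The sketch below is an illustration under stated assumptions, not the author's construction: it takes X to be the set of exponents 0, 1, 2, ..., so that L(X) models the space of polynomials with basis 1, x, x^2, ..., and the names `add`, `scale`, and `evaluate` are hypothetical.

```python
def add(f, g):
    """Pointwise sum of two finitely supported functions stored as dicts."""
    h = dict(f)
    for key, value in g.items():
        h[key] = h.get(key, 0) + value
    # Drop zero values so the dict records exactly the finite support.
    return {k: v for k, v in h.items() if v != 0}

def scale(a, f):
    """Pointwise scalar multiple a*f."""
    return {k: a * v for k, v in f.items() if a * v != 0}

def evaluate(f, key):
    """Value of f at a point of X; zero outside the finite support."""
    return f.get(key, 0)

# Assumed illustration: the key k labels the basis element x^k.
p = {0: 2, 3: -1}        # 2 - x^3
q = {1: 5, 3: 1}         # 5x + x^3
s = add(p, q)            # 2 + 5x  (the x^3 terms cancel)
t = scale(3, p)          # 6 - 3x^3
```

The sum s has support {0, 1}, which shows that the support of a sum can be strictly smaller than the union of the supports.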


Theorems F and G are of considerable interest in that they reveal 
what simple things linear spaces really are. The reader may well feel, in 
the light of these results, that the concept of an abstract linear space has 
served its purpose and should now be abandoned, and that all further 
study of linear spaces should be directed specifically at the L(X)'s, or
in the finite-dimensional case, at $R^n$ and $C^n$. There are at least two
reasons why this is not a useful point of view. One of these lies in the 
fact that the above isomorphisms were established by arbitrarily choosing 
one particular basis B for L in preference to all the others, whereas most 
of the important ideas in the theory of linear spaces are independent of 
any specially chosen basis and are best treated, when this is possible, 
without reference to any basis whatever. A second reason is that almost 
all the linear spaces of greatest interest carry additional algebraic or 
topological structure, which need not be related in any significant manner 
to the above isomorphisms. 


Problems 


1. Let L be a non-zero finite-dimensional linear space of dimension n.
Show that every set of n + 1 vectors in L is linearly dependent.
Show that a set of n vectors in L is a basis $\Leftrightarrow$ it is linearly
independent $\Leftrightarrow$ it spans L.

2. Show that the vectors (1, 0, 0), (1, 1, 0), (1, 1, 1) form a basis for $R^3$.
Show that if $\{e_1, e_2, e_3\}$ is a basis for $R^3$, then
$\{e_1 + e_2, e_1 + e_3, e_2 + e_3\}$ is also a basis.

3. Let M be a subspace of a linear space L, and show that there exists a
subspace N such that $L = M \oplus N$. Give an example for the case
in which $L = R^2$ to show that N need not be uniquely determined by
M.

4. If M and N are subspaces of a linear space L, and if $L = M \oplus N$,
show that the mapping $y \to y + M$ which sends each y in N to
y + M in L/M is an isomorphism of N onto L/M.

5. Denote the dimension of a linear space L by d(L). If L is
finite-dimensional, and if M and N are subspaces of L, prove the following:
(a) $d(M) \le d(L)$, and $d(M) = d(L) \Leftrightarrow M = L$;
(b) $d(M) + d(N) = d(M + N) + d(M \cap N)$;
(c) if L = M + N, then $L = M \oplus N \Leftrightarrow d(L) = d(M) + d(N)$;
(d) $d(L/M) = d(L) - d(M)$.

6. If L and L' are linear spaces, show that L is isomorphic to L' $\Leftrightarrow$ they
have the same scalars and the same dimension.


44. LINEAR TRANSFORMATIONS


Let L and L’ be linear spaces with the same system of scalars. A 
mapping T of L into L’ is called a linear transformation if 


$$T(x + y) = T(x) + T(y) \quad\text{and}\quad T(\alpha x) = \alpha T(x),$$


or equivalently, if


$$T(\alpha x + \beta y) = \alpha T(x) + \beta T(y).$$


A linear transformation of one linear space into another is thus a
homomorphism of the first space into the second, for it is a mapping which
preserves the linear operations. T also preserves the origin and
negatives, for $T(0) = T(0 \cdot 0) = 0 \cdot T(0) = 0$ and


$$T(-x) = T((-1)x) = (-1)T(x) = -T(x).$$


The importance of linear spaces lies mainly in the linear transforma- 
tions they carry, for vast tracts of algebra and analysis, when placed in 
their proper context, reduce to the study of linear transformations of one 
linear space into another. The theory of matrices, for instance, is one 
small corner of this subject, as are the theory of certain types of differen- 
tial and integral equations and the theory of integration in its most ele- 
gant modern form. 

In the following examples we leave it to the reader to show that each 
mapping described actually is a linear transformation. 


Example 1. We consider the linear space $R^2$, and each linear
transformation mentioned is a mapping of $R^2$ into itself.

(a) $T_1((x_1, x_2)) = (\alpha x_1, \alpha x_2)$, where $\alpha$ is a real number. The effect
of $T_1$ is to multiply each vector in $R^2$ by the scalar $\alpha$.

(b) $T_2((x_1, x_2)) = (x_2, x_1)$. $T_2$ reflects $R^2$ about the diagonal line
$x_1 = x_2$.

(c) $T_3((x_1, x_2)) = (x_1, 0)$. $T_3$ projects $R^2$ onto the $x_1$ axis.

(d) $T_4((x_1, x_2)) = (0, x_2)$. $T_4$ projects $R^2$ onto the $x_2$ axis.


Example 2. Consider the linear space P of all polynomials p(x), with
real coefficients, defined on [0,1]. The mapping D defined by

$$D(p) = \frac{dp}{dx}$$

is clearly a linear transformation of P into itself.


Example 3. The mapping I defined by

$$I(f) = \int_0^1 f(x)\,dx$$

is easily seen to be a linear transformation of C[0,1] into the real linear
space R of all real numbers.


We return to our consideration of the linear spaces L and L' and of
the linear transformations of L into L'. If T and U are two such
transformations, then they can be added in a natural way to yield T + U,
which is defined by

$$(T + U)(x) = T(x) + U(x). \tag{1}$$

Similarly, any such transformation T can be multiplied by any scalar $\alpha$,
in accordance with

$$(\alpha T)(x) = \alpha T(x). \tag{2}$$


Simple computations show readily that T + U and $\alpha T$ are themselves
linear transformations of L into L', and it is easily proved that these
definitions convert the set of all such linear transformations into a linear
space. The zero transformation 0 (i.e., the zero element of this linear
space) and the negative -T of a transformation T are defined by
0(x) = 0 and (-T)(x) = -T(x). In summary, we have


Theorem A. Let L and L’ be two linear spaces with the same system of 
scalars. Then the set of all linear transformations of L into L’ is itself a 
linear space with respect to the linear operations defined by Eqs. (1) and (2). 


The most interesting and significant applications of these ideas occur 
in the special cases in which (1) L’ equals L, and (2) L’ equals the linear 
space of all scalars of L. We now develop a few of the simpler concepts 
which arise in case (1). Case (2) will be treated in some detail in the 
next chapter. 

We assume, then, that we have a single linear space L, and we
consider the linear space of all linear transformations of L into itself. We
usually speak of these as linear transformations on L. The most
important feature of this situation is that if T and U are any two linear
transformations on L, then we can define their product TU by means of


$$(TU)(x) = T(U(x)). \tag{3}$$


This is precisely the multiplication of mappings discussed at the end 
of Sec. 3, and Problem 3-1 assures us that this operation is associative: 


T(UV) = (TU)V. (4) 


Furthermore, multiplication is related to addition by the distributive
laws

$$T(U + V) = TU + TV \tag{5}$$

and

$$(T + U)V = TV + UV, \tag{6}$$

and to scalar multiplication by

$$\alpha(TU) = (\alpha T)U = T(\alpha U). \tag{7}$$

The proofs of these facts are easy. As an illustration, we prove (6) by
the following computation:

$$\begin{aligned}
((T + U)V)(x) &= (T + U)(V(x)) \\
&= T(V(x)) + U(V(x)) \\
&= (TV)(x) + (UV)(x) \\
&= (TV + UV)(x).
\end{aligned}$$


Examples can readily be found to show that multiplication is in general
non-commutative. For instance, if we define a linear transformation M
on the space P of polynomials p(x) by M(p) = xp, then

$$(MD)(p) = M(D(p)) = xD(p) = x\frac{dp}{dx}$$

and

$$(DM)(p) = D(M(p)) = D(xp) = x\frac{dp}{dx} + p,$$

so $MD \neq DM$. Also, it is quite possible for the product of two non-zero
linear transformations to be 0. Examples 1c and 1d demonstrate this,
for the transformations $T_3$ and $T_4$ are both different from 0, and yet
$T_3 T_4 = 0$.
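
Both phenomena can be checked on a particular vector. The sketch below is an illustration only: it assumes that a polynomial is stored as its list of coefficients [c0, c1, c2, ...], and the helper names `trim` and `add` are hypothetical. The functions `D` and `M` follow the definitions in the text, and `T3`, `T4` are the projections of Example 1.

```python
def trim(p):
    """Remove trailing zero coefficients, keeping at least the constant term."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def D(p):
    """Differentiate a polynomial given as a coefficient list [c0, c1, c2, ...]."""
    return [k * c for k, c in enumerate(p)][1:] or [0]

def M(p):
    """Multiply a polynomial by x (shift every coefficient up one degree)."""
    return [0] + list(p)

def add(p, q):
    """Add two coefficient lists of possibly different lengths."""
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return trim([a + b for a, b in zip(p, q)])

p = [1, 2, 0, 4]                 # assumed sample: 1 + 2x + 4x^3
md = trim(M(D(p)))               # (MD)(p) = x * dp/dx
dm = trim(D(M(p)))               # (DM)(p) = d(xp)/dx

def T3(v):
    """Projection of R^2 onto the x1 axis (Example 1c)."""
    return (v[0], 0)

def T4(v):
    """Projection of R^2 onto the x2 axis (Example 1d)."""
    return (0, v[1])
```

For this p we obtain (MD)(p) = 2x + 12x^3 and (DM)(p) = 1 + 4x + 16x^3, which differ by exactly p, in agreement with (DM)(p) = x dp/dx + p. Likewise `T3(T4(v))` is the zero vector for every v, although neither `T3` nor `T4` is the zero transformation.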

We have so far seen only one specific linear transformation on the
arbitrary linear space L, namely, the zero transformation 0. Another
is the identity transformation I, defined by I(x) = x. We observe that
$I \neq 0 \Leftrightarrow L \neq \{0\}$, and that

$$TI = IT = T \tag{8}$$

for every linear transformation T on L. If $\alpha$ is any scalar, then the
linear transformation $\alpha I$ is called a scalar multiplication, for

$$(\alpha I)(x) = \alpha I(x) = \alpha x$$

shows that the effect of $\alpha I$ is to multiply each vector in L by $\alpha$.

A linear transformation T on L is called non-singular if it is
one-to-one and onto, and singular otherwise. If T is non-singular, then by
Sec. 3 its inverse $T^{-1}$ exists as a mapping and satisfies the following
equation:

$$TT^{-1} = T^{-1}T = I. \tag{9}$$

It is not difficult to show that when T is non-singular, then the mapping
$T^{-1}$ is also a linear transformation on L.

A particularly important type of linear transformation on L arises
as follows. Let L be the direct sum of the subspaces M and N, so that
$L = M \oplus N$. This means, of course, that each vector z in L can be
written uniquely in the form z = x + y with x in M and y in N. Since
x is uniquely determined by z, we can define a mapping E of L into itself
by E(z) = x. E is easily seen to be a linear transformation on L, and it
is called the projection on M along N. Figure 33 indicates the geometric
reason for this terminology. The most significant property of E is that
it is idempotent, in the sense that $E^2 = E$; for since x = x + 0 is the
representation of x as the sum of a vector in M and a vector in N, we
have

$$E^2(z) = (EE)(z) = E(E(z)) = E(x) = x = E(z).$$

This property of idempotence is characteristic of projections, as our next
theorem shows.

Fig. 33. The projections E on M along N and I - E on N along M.


Theorem B. If E is a linear transformation on a linear space L, then E is
idempotent $\Leftrightarrow$ there exist subspaces M and N of L such that $L = M \oplus N$
and E is the projection on M along N.


PROOF. In view of the above remarks, it suffices to show that if E is
idempotent, then it is the projection on M along N for suitable M and N.
We define M and N by $M = \{E(z) : z \in L\}$ and $N = \{z : E(z) = 0\}$.
Both are clearly subspaces, and we must show that $L = M \oplus N$. By
Theorem 42-B, it suffices to show that M and N span L and are disjoint.
That M and N span L follows from the fact that each z in L can be
written in the form

$$z = E(z) + (I - E)(z); \tag{10}$$

for E(z) is obviously in M, and

$$E((I - E)(z)) = (E(I - E))(z) = (E - E^2)(z) = (E - E)(z) = 0(z) = 0$$

shows that (I - E)(z) is in N. To see that M and N are disjoint, we
have only to notice that if a vector E(z) in M is also in N, so that
E(E(z)) = 0, then $E(E(z)) = E^2(z) = E(z)$ shows that E(z) = 0. This
proves that $L = M \oplus N$, and it follows from Eq. (10) that E is precisely
the projection on M along N.


The unsymmetric way in which M and N are treated in this
discussion can easily be balanced by considering the mapping which makes
correspond to each z = x + y the vector y (instead of x). This linear
transformation (it is clearly I - E) is the projection on N along M. In
the light of our theorem, we define a projection on L to be an idempotent
linear transformation on L. If E is a linear transformation on L, then
the equation

$$(I - E)^2 = (I - E)(I - E) = I - E - E + E^2$$

shows that E is a projection $\Leftrightarrow$ I - E is a projection; and we know that
if E is the projection on M along N, then I - E is the projection on N
along M, and conversely. We make one further comment on these
matters: if M is a given subspace of L, then by Problem 43-3 there
certainly exists a projection on M and along some N; but since there may
be many different subspaces N such that $L = M \oplus N$, there may also
be many different projections on M (and along various N's).
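
The dependence of a projection on the complementary subspace N is easy to see in $R^2$. The following sketch rests on assumptions chosen for illustration: M is the line spanned by (1, 0), N is taken first as the line spanned by (0, 1) and then as the line spanned by (1, 1), and the function name `projection` is hypothetical.

```python
from fractions import Fraction

def projection(m, n):
    """Return E, the projection on span{m} along span{n} in R^2.
    Assumes m and n are linearly independent, so R^2 is their direct sum."""
    det = Fraction(m[0] * n[1] - m[1] * n[0])

    def E(z):
        # Solve z = a*m + b*n by Cramer's rule and keep the component a*m in M.
        a = (z[0] * n[1] - z[1] * n[0]) / det
        return (a * m[0], a * m[1])

    return E

E1 = projection((1, 0), (0, 1))   # on the x1 axis along the x2 axis
E2 = projection((1, 0), (1, 1))   # on the x1 axis along the diagonal
z = (3, 2)

x1 = E1(z)                                  # component of z in M for the first N
x2 = E2(z)                                  # component of z in M for the second N
y2 = (z[0] - x2[0], z[1] - x2[1])           # (I - E2)(z), the component in N
```

With z = (3, 2) the first projection gives (3, 0) and the second gives (1, 0), since (3, 2) = 1(1, 0) + 2(1, 1). Both mappings are idempotent and both have range M, yet they are different projections on M.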


Problems 


1. Show that the mappings defined by Eqs. (1) to (3) are linear 
transformations. 

2. Show that the linear transformations $T_2$ and $T_3$ defined in Example 1
do not commute; that is, show that $T_2 T_3 \neq T_3 T_2$.

3. If D and M are the linear transformations on the space P defined
in the text, show that DM = MD + I and $(MD)^2 = M^2 D^2 + MD$.

4. Let T be a linear transformation on a linear space L, and show that T
is non-singular $\Leftrightarrow$ there exists a linear transformation T' on L such
that TT' = T'T = I.

5. Let T be a linear transformation on a linear space L, and prove
that T is non-singular $\Leftrightarrow$ T(B) is a basis for L whenever B is.

6. Prove that a linear transformation on a finite-dimensional linear
space is non-singular $\Leftrightarrow$ it is one-to-one $\Leftrightarrow$ it is onto.

7. Show that the set of all non-singular linear transformations on a
linear space L is a group with respect to multiplication. If L is
finite-dimensional with dimension n > 0, this group is called the
full linear group of degree n.

8. If L and L' are non-zero linear spaces (both real or both complex),
prove that there exists a non-zero linear transformation of L into L'.

9. Let L be a linear space, and let x and y be vectors in L such that
$x \neq 0$. Prove that there exists a linear transformation T on L such
that T(x) = y. If y is not a scalar multiple of x, prove that there
exists a linear transformation T' on L such that T'(x) = 0 and
$T'(y) \neq 0$.

10. Let L and L' be linear spaces with the same scalars, and let T be a
linear transformation of L into L'. The null space of T, namely,
$\{x : T(x) = 0\}$, and its range, $\{T(x) : x \in L\}$, are clearly subspaces
of L and L'. The nullity of T, denoted by n(T), is the dimension
of its null space, and its rank r(T) is the dimension of its range. If
L is finite-dimensional, prove that n(T) + r(T) = d(L).

11. If E is a projection on a linear space L, show that its range equals
the set of all vectors which are fixed under E; i.e., show that
$\{E(z) : z \in L\} = \{z : E(z) = z\}$.


45. ALGEBRAS 


A linear space A is called an algebra (see Sec. 20) if its vectors can 
be multiplied in such a way that A is also a ring in which scalar multipli- 
cation is related to multiplication by the following property: 


$$\alpha(xy) = (\alpha x)y = x(\alpha y).$$


The concept of an algebra is therefore a natural combination of the
concepts of a linear space and a ring. Figure 34 illustrates the manner in
which the major algebraic systems defined in this chapter are related to
one another.

[Fig. 34. The major algebraic systems: additive Abelian groups, linear
spaces, algebras.]

Since an algebra is a linear space, all the ideas developed in Secs. 42 
and 43 are immediately applicable. Some algebras are real and some 
are complex, and every algebra has a well-defined dimension. Further- 
more, since an algebra is also a ring, it may be commutative or non-com- 
mutative, and may or may not have an identity; and if it does have 
an identity, then we can speak of its regular and singular elements. A 
division algebra is an algebra with identity which, as a ring, is a division
ring. A subalgebra of an algebra A is a non-empty subset $A_0$ of A which
is an algebra in its own right with respect to the operations in A. This
condition evidently means that $A_0$ is closed under addition, scalar
multiplication, and multiplication.


Example 1. (a) The real linear space R of all real numbers (see
Example 42-1) is a commutative real algebra with identity if multiplication is
defined in the ordinary way. The reader will observe that scalar
multiplication is indistinguishable from ring multiplication in this system.

(b) The complex linear space C of all complex numbers defined in 
Example 42-4 is a commutative complex algebra with identity if multi- 
plication is defined as usual. Again we see that scalar multiplication 
and ring multiplication are the same. 


Example 2. (a) The real linear space C(X,R) (see Example 42-3) is a
commutative real algebra with identity if multiplication is defined
pointwise.

(b) The complex linear space C(X,C) defined in Example 42-6 is a
commutative complex algebra with identity with respect to pointwise
multiplication.


Example 3. Let L be a linear space. We know by Theorem 44-A that 
the set of all linear transformations on L is a linear space with respect 
to the linear operations defined by Eqs. 44-(1) and 44-(2). The discus- 
sion following this theorem can be summed up as follows. If multiplica- 
tion is defined by Eq. 44-(3), then this linear space is an algebra which 
is real or complex according as L is real or complex. This algebra has an
identity (the identity transformation) $\Leftrightarrow$ $L \neq \{0\}$, and in general it is
non-commutative and has non-zero divisors of zero.


An ideal I in an algebra A is a non-empty subset of A which is both a
subspace when A is considered as a linear space and an ideal when A is
considered as a ring. By Theorems 42-A and 41-B, A/I is both a linear
space and a ring. It is easy to see that A/I is actually an algebra, called
the quotient algebra of A with respect to I. An ideal in A in our present
sense is sometimes called an algebra ideal, as opposed to what we might
call a ring ideal, that is, an ideal in A when A is considered as a ring.
By our definition, an algebra ideal is a ring ideal which is also a subspace.
In the cases of interest to us, the distinction between these two types of
ideals disappears. For if A has an identity 1, and if I is a ring ideal in A,
then the fact that $i \in I \Rightarrow \alpha i = \alpha(1i) = (\alpha 1)i \in I$ for every scalar $\alpha$
shows that I is closed under scalar multiplication, and is therefore an
algebra ideal.

We shall return to the subject of algebras and their ideals in later 
chapters. 


Problems 


1. Let T be a non-singular linear transformation on a linear space L,
and let A be the algebra of all linear transformations on L. Show
that T is non-singular (i.e., regular) as an element of the algebra
A $\Leftrightarrow$ $L \neq \{0\}$.

2. If A is an algebra, show that the subset of A defined by
$C = \{x : xy = yx \text{ for every } y \in A\}$ is a subalgebra of A. C is called
the center of A (see Problem 39-6).

3. Let A be an algebra of linear transformations on a linear space L.
If A contains the identity transformation, prove that the center of A
contains all scalar multiplications. If A is the algebra of all linear
transformations on L, prove that the center of A consists precisely
of the scalar multiplications (Hint: see Problem 44-9).

4. Let A and A' be algebras which are both real or both complex. As
usual, we define a homomorphism of A into A' to be a mapping f of A
into A' which preserves all the operations, in the sense that
f(x + y) = f(x) + f(y), $f(\alpha x) = \alpha f(x)$, and f(xy) = f(x)f(y). An isomorphism
is a one-to-one homomorphism, and A is said to be isomorphic to A'
if there exists an isomorphism of A onto A'. Now let A be an
arbitrary algebra with identity, and prove that the mapping f defined
on A by $f(x) = M_x$, where $M_x(y) = xy$, is an isomorphism of A into
the algebra of all linear transformations on A. This fact is analogous
to Cayley's theorem (see Problem 39-11). The isomorphism f is
called the regular representation of A (by linear transformations on
itself).


CHAPTER NINE 


Banach Spaces 


We have already seen, in Sec. 14, that a Banach space is a linear 
space which is also, in a special way, a complete metric space. This 
combination of algebraic and metric structures opens up the possibility 
of studying linear transformations of one Banach space into another 
which have the additional property of being continuous. 

Most of our work in this chapter centers around three fundamental 
theorems relating to continuous linear transformations. The Hahn- 
Banach theorem guarantees that a Banach space is richly supplied with 
continuous linear functionals, and makes possible an adequate theory of 
conjugate spaces. The open mapping theorem enables us to give a satis- 
factory description of the projections on a Banach space, and has the 
important closed graph theorem as one of its consequences. We use the 
uniform boundedness theorem in our discussion of the conjugate of an 
operator on a Banach space, and this in turn provides the setting for our 
treatment in the next chapter of the adjoint of an operator on a Hilbert 
space. 

Virtually all this theory had its origins in analysis. Our present 
interest, however, lies in the study of form and structure, not in explor- 
ing the many applications of these ideas to specific problems. This 
chapter is therefore strongly oriented toward the algebraic and
topological aspects of the matters at hand.


46. THE DEFINITION AND SOME EXAMPLES


We begin by restating the definition of a Banach space. 

A normed linear space is a linear space N in which to each vector x
there corresponds a real number, denoted by $\|x\|$ and called the norm
of x, in such a manner that

(1) $\|x\| \ge 0$, and $\|x\| = 0 \Leftrightarrow x = 0$;

(2) $\|x + y\| \le \|x\| + \|y\|$;

(3) $\|\alpha x\| = |\alpha| \, \|x\|$.

The non-negative real number $\|x\|$ is to be thought of as the length
of the vector x. If we regard $\|x\|$ as a real function defined on N, this
function is called the norm on N. It is easy to verify that the normed
linear space N is a metric space with respect to the metric d defined by
$d(x,y) = \|x - y\|$. A Banach space is a complete normed linear space.
Our main interest in this chapter is in Banach spaces, but there are 
several points in the body of the theory at which it is convenient to have 
the basic definitions and some of the simpler facts formulated in terms 
of normed linear spaces. For this reason, and also to emphasize the 
role of completeness in theorems which require this assumption, we work 
in the more general context whenever possible. The reader will find that 
the deeper theorems, in which completeness hypotheses are necessary, 
often make essential use of Baire’s theorem. 

Several simple but important facts about a normed linear space are
based on the following inequality:

$$|\,\|x\| - \|y\|\,| \le \|x - y\|. \tag{1}$$

To prove this, it suffices to prove that

$$\|x\| - \|y\| \le \|x - y\|; \tag{2}$$

for it follows from (2) that we also have

$$-(\|x\| - \|y\|) = \|y\| - \|x\| \le \|y - x\| = \|-(x - y)\| = \|x - y\|,$$

which together with (2) yields (1). We now prove (2) by observing that
$\|x\| = \|(x - y) + y\| \le \|x - y\| + \|y\|$. The main conclusion we draw
from (1) is that the norm is a continuous function:

$$x_n \to x \Rightarrow \|x_n\| \to \|x\|.$$

This is clear from the fact that $|\,\|x_n\| - \|x\|\,| \le \|x_n - x\|$, since $x_n \to x$
means that $\|x_n - x\| \to 0$. In the same vein, we can prove that addition
and scalar multiplication are jointly continuous (see Problem 22-5), for

$$x_n \to x \text{ and } y_n \to y \Rightarrow x_n + y_n \to x + y$$

and

$$\alpha_n \to \alpha \text{ and } x_n \to x \Rightarrow \alpha_n x_n \to \alpha x.$$

These assertions follow from

$$\|(x_n + y_n) - (x + y)\| = \|(x_n - x) + (y_n - y)\| \le \|x_n - x\| + \|y_n - y\|$$

and

$$\|\alpha_n x_n - \alpha x\| = \|\alpha_n(x_n - x) + (\alpha_n - \alpha)x\| \le |\alpha_n| \, \|x_n - x\| + |\alpha_n - \alpha| \, \|x\|.$$


Our first theorem exhibits one of the most useful ways of forming 
new normed linear spaces out of old ones. 


Theorem A. Let M be a closed linear subspace of a normed linear space N.
If the norm of a coset x + M in the quotient space N/M is defined by

$$\|x + M\| = \inf \{\|x + m\| : m \in M\}, \tag{3}$$

then N/M is a normed linear space. Further, if N is a Banach space,
then so is N/M.

PROOF. We first verify that (3) defines a norm in the required sense.
It is obvious that $\|x + M\| \ge 0$; and since M is closed, it is easy to see
that $\|x + M\| = 0 \Leftrightarrow$ there exists a sequence $\{m_k\}$ in M such that
$\|x + m_k\| \to 0 \Leftrightarrow$ x is in M $\Leftrightarrow$ x + M = M = the zero element of N/M.
Next, we have

$$\begin{aligned}
\|(x + M) + (y + M)\| &= \|(x + y) + M\| \\
&= \inf \{\|x + y + m\| : m \in M\} \\
&= \inf \{\|x + y + m + m'\| : m \text{ and } m' \in M\} \\
&= \inf \{\|(x + m) + (y + m')\| : m \text{ and } m' \in M\} \\
&\le \inf \{\|x + m\| + \|y + m'\| : m \text{ and } m' \in M\} \\
&= \inf \{\|x + m\| : m \in M\} + \inf \{\|y + m'\| : m' \in M\} \\
&= \|x + M\| + \|y + M\|.
\end{aligned}$$

The proof of $\|\alpha(x + M)\| = |\alpha| \, \|x + M\|$ is similar.

Finally, we assume that N is complete, and we show that N/M is
also complete. If we start with a Cauchy sequence in N/M, then by
Problem 12-2 it suffices to show that this sequence has a convergent
subsequence. It is clearly possible to find a subsequence $\{x_n + M\}$ of the
original Cauchy sequence such that $\|(x_1 + M) - (x_2 + M)\| < 1/2$,
$\|(x_2 + M) - (x_3 + M)\| < 1/4$, and, in general,
$\|(x_n + M) - (x_{n+1} + M)\| < 1/2^n$. We prove that this sequence is convergent in N/M. We
begin by choosing any vector $y_1$ in $x_1 + M$, and we select $y_2$ in $x_2 + M$
such that $\|y_1 - y_2\| < 1/2$. We next select a vector $y_3$ in $x_3 + M$ such
that $\|y_2 - y_3\| < 1/4$. Continuing in this way, we obtain a sequence
$\{y_n\}$ in N such that $\|y_n - y_{n+1}\| < 1/2^n$. If m < n, then

$$\begin{aligned}
\|y_m - y_n\| &= \|(y_m - y_{m+1}) + (y_{m+1} - y_{m+2}) + \cdots + (y_{n-1} - y_n)\| \\
&\le \|y_m - y_{m+1}\| + \|y_{m+1} - y_{m+2}\| + \cdots + \|y_{n-1} - y_n\| \\
&< 1/2^m + 1/2^{m+1} + \cdots + 1/2^{n-1} < 1/2^{m-1},
\end{aligned}$$

so $\{y_n\}$ is a Cauchy sequence in N. Since N is complete, there exists a
vector y in N such that $y_n \to y$. It now follows from
$\|(x_n + M) - (y + M)\| \le \|y_n - y\|$ that $x_n + M \to y + M$, so N/M is complete.

In the following sections and chapters, we shall often have occasion
to consider the quotient space of a normed linear space with respect to a
closed linear subspace. In accordance with our theorem, a quotient
space of this kind can always be regarded as a normed linear space in its
own right.
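
The infimum in formula (3) can be watched at work in a very small case. The sketch below is an illustration under assumptions that are not in the text: N is $R^2$ with the norm |x1| + |x2|, M is the closed subspace of all vectors (t, t), and the infimum is approximated by scanning t over a finite grid. The name `quotient_norm` and the grid parameters are hypothetical. For this particular choice a direct calculation gives the value |x1 - x2| for the norm of the coset x + M.

```python
def quotient_norm(x, step=1000, bound=10):
    """Approximate ||x + M|| = inf{ |x1 + t| + |x2 + t| : (t, t) in M }
    by scanning t over the grid -bound, ..., bound with spacing 1/step.
    This is a crude numerical stand-in for the infimum, adequate only
    when the minimizing t lies inside the scanned interval."""
    ts = (k / step for k in range(-bound * step, bound * step + 1))
    return min(abs(x[0] + t) + abs(x[1] + t) for t in ts)

a = quotient_norm((3, 1))    # coset of (3, 1)
b = quotient_norm((8, 6))    # same coset, since (8, 6) = (3, 1) + (5, 5)
c = quotient_norm((4, 4))    # (4, 4) lies in M, so this is the zero coset
```

The values a and b agree (both are 2 up to rounding), which reflects the fact that the norm depends only on the coset and not on the representative, and c is 0 because (4, 4) is in M.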

We now describe some of the main examples of Banach spaces. 
In each of these, the linear operations are understood to be defined 
either coordinatewise or pointwise, whichever is appropriate in the 
circumstances. 


Example 1. The spaces R and C (the real numbers and the complex
numbers) are the simplest of all normed linear spaces. The norm of a
number x is of course defined by $\|x\| = |x|$, and each space is a Banach
space.


Example 2. The linear spaces $R^n$ and $C^n$ of all n-tuples
$x = (x_1, x_2, \ldots, x_n)$ of real and complex numbers can be made into normed
linear spaces in an infinite variety of ways, as we shall see below. If the
norm is defined by

$$\|x\| = \Bigl(\sum_{i=1}^{n} |x_i|^2\Bigr)^{1/2}, \tag{4}$$

then we get the n-dimensional Euclidean and unitary spaces familiar to
us from our earlier work. We denoted these spaces by $R^n$ and $C^n$ in
Part 1 of this book, and we know by the theorems of Sec. 15 that both are
Banach spaces.


Each of the following examples consists of n-tuples of scalars, 
sequences of scalars, or scalar-valued functions defined on some non- 
empty set, where the scalars are the real numbers or the complex numbers. 
We do not normally specify which system of scalars is to be used, and it 
should be emphasized that both possibilities are allowed unless the 
contrary is clearly stated. Also, we make no distinction in notation 
between the real case and the complex case. When it turns out to be 
necessary to distinguish these two cases, we do so verbally, by referring, 
for instance, to ‘‘the complex space —.’’ These conventions are in 
accord with the standard usage preferred by most mathematicians, 
and they enable us to avoid a good deal of cumbersome notation and 
many unnecessary case distinctions. 


Example 3. Let p be a real number such that $1 \le p < \infty$. We denote
by $l_p^n$ the space of all n-tuples $x = (x_1, x_2, \ldots, x_n)$ of scalars, with the
norm defined by

$$\|x\|_p = \Bigl(\sum_{i=1}^{n} |x_i|^p\Bigr)^{1/p}. \tag{5}$$

Formula (4) is obviously the special case of (5) which corresponds to
p = 2, so the real and complex spaces $l_2^n$ are the n-dimensional Euclidean
and unitary spaces $R^n$ and $C^n$. It is easy to see that (5) satisfies
conditions (1) and (3) required by the definition of a norm. In Problem 4 we
outline a proof of the fact that (5) also satisfies condition (2), that is,
that $\|x + y\|_p \le \|x\|_p + \|y\|_p$. The completeness of $l_p^n$ follows from
substantially the same reasoning as that used in the proof of Theorem
15-A, so $l_p^n$ is a Banach space.


Example 4. We again consider a real number p with the property that
$1 \le p < \infty$, and we denote by $l_p$ the space of all sequences

$$x = \{x_1, x_2, \ldots, x_n, \ldots\}$$

of scalars such that $\sum_{n=1}^{\infty} |x_n|^p < \infty$, with the norm defined by

$$\|x\|_p = \Bigl(\sum_{n=1}^{\infty} |x_n|^p\Bigr)^{1/p}. \tag{6}$$

The reader will observe that the real and complex spaces $l_2$ are precisely
the infinite-dimensional Euclidean and unitary spaces $R^\infty$ and $C^\infty$ defined
in Problem 15-4. The proof of the fact that $l_p$ actually is a Banach
space requires arguments similar to those used in Problems 15-3 and 15-4.


The Banach spaces discussed in these examples are all special cases of
the important $L_p$ spaces studied in the theory of measure and integration.
A detailed treatment of these spaces is outside the scope of this book, but
we can describe them loosely as follows. An $L_p$ space essentially consists
of all measurable functions f defined on a measure space X with measure
m which are such that $|f(x)|^p$ is integrable, with

$$\|f\|_p = \Bigl(\int |f(x)|^p \, dm(x)\Bigr)^{1/p} \tag{7}$$

taken as the norm. In order to include the spaces $l_p^n$ and $l_p$ within the
theory of $L_p$ spaces, we have only to consider the sets $\{1, 2, \ldots, n\}$
and $\{1, 2, \ldots, n, \ldots\}$ as measure spaces in which each point has
measure 1, and to regard n-tuples and sequences of scalars as functions
defined on these sets. Since integration is a generalized type of
summation, formulas (5) and (6) are special cases of formula (7).¹


¹ Several remarks and examples relating to $L_p$ spaces are scattered about in this
and the next chapter. This fragmentary material is not essential for an
understanding of these chapters, and may be disregarded by any reader without the
necessary background. Brief sketches of the relevant ideas can be found in
Taylor [41, chap. 7] and Loomis [27, chap. 3]. For more extended treatments, see
Halmos [18], Zaanen [45], or Kolmogorov and Fomin [26, vol. 2].


Example 5. Just as in Example 3, we start with the linear space of all
n-tuples $x = (x_1, x_2, \ldots, x_n)$ of scalars, but this time we define the
norm by

$$\|x\| = \max \{|x_1|, |x_2|, \ldots, |x_n|\}. \tag{8}$$

This Banach space is commonly denoted by $l_\infty^n$, and the symbol $\|x\|_\infty$ is
occasionally used for the norm given by (8). The reason for this practice
lies in the interesting fact that

$$\|x\|_\infty = \lim \|x\|_p \quad\text{as } p \to \infty,$$

that is, that

$$\max \{|x_i|\} = \lim \Bigl(\sum_{i=1}^{n} |x_i|^p\Bigr)^{1/p} \quad\text{as } p \to \infty. \tag{9}$$

We briefly inspect the case n = 2 to see why this is true. Let $x = (x_1, x_2)$
be an ordered pair of real numbers with $x_1$ and $x_2 \ge 0$. It is clear
that $\|x\|_\infty = \max \{x_1, x_2\} \le (x_1^p + x_2^p)^{1/p} = \|x\|_p$. If $x_1 = x_2$, then
$\lim \|x\|_p = \lim (2x_2^p)^{1/p} = \lim 2^{1/p} x_2 = x_2 = \|x\|_\infty$. And if $x_1 < x_2$, then

$$\begin{aligned}
\lim \|x\|_p &= \lim (x_1^p + x_2^p)^{1/p} = \lim \bigl([(x_1/x_2)^p + 1]\,x_2^p\bigr)^{1/p} \\
&= \lim [(x_1/x_2)^p + 1]^{1/p} x_2 = x_2 = \|x\|_\infty.
\end{aligned}$$
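
The limit in (9) can also be observed numerically. The following is a minimal sketch, assuming the sample vector x = (3, 4) and a handful of exponents chosen only for illustration; the function names `p_norm` and `max_norm` are hypothetical.

```python
def p_norm(x, p):
    """||x||_p = (sum |x_i|^p)^(1/p) for 1 <= p < infinity, as in formula (5)."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def max_norm(x):
    """||x||_infinity = max |x_i|, as in formula (8)."""
    return max(abs(t) for t in x)

x = (3.0, 4.0)                                   # assumed sample vector
exponents = (1, 2, 4, 16, 64)
values = [p_norm(x, p) for p in exponents]       # ||x||_p for increasing p
limit = max_norm(x)                              # the claimed limit, here 4
```

For this vector the computed norms start at 7 (p = 1) and 5 (p = 2) and then decrease toward 4, never falling below it, in agreement with the inequality $\|x\|_\infty \le \|x\|_p$ noted above.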
Example 6. Consider the linear space of all bounded sequences
$x = \{x_1, x_2, \ldots, x_n, \ldots\}$ of scalars. By analogy with Example 5, we define
the norm by

$$\|x\| = \sup |x_n|, \tag{10}$$

and we denote the resulting Banach space by $l_\infty$. The set c of all
convergent sequences is easily seen to be a closed linear subspace of $l_\infty$ and
is therefore itself a Banach space. Another Banach space in this family
is the subset $c_0$ of c which consists of all convergent sequences with
limit 0.

Example 7. The Banach space of primary interest to us is the space
C(X)¹ of all bounded continuous scalar-valued functions defined on a
topological space X, with the norm given by

$$\|f\| = \sup |f(x)|. \tag{11}$$

This norm is sometimes called the uniform norm, because the statement
that $f_n$ converges to f with respect to this norm means that $f_n$ converges
to f uniformly on X. The fact that this space is complete amounts to the
fact that if f is the uniform limit of a sequence of bounded continuous
functions, then f itself is bounded and continuous. If, as above, we
consider n-tuples and sequences as functions defined on $\{1, 2, \ldots, n\}$
and $\{1, 2, \ldots, n, \ldots\}$, then the spaces $l_\infty^n$ and $l_\infty$ are the special cases
of C(X) which correspond to choosing X to be the sets just mentioned,
each with the discrete topology.


¹ The real space $\mathcal{C}(X)$ and the complex space $\mathcal{C}(X)$ are, of course, the spaces
previously denoted by $\mathcal{C}(X,R)$ and $\mathcal{C}(X,C)$.
Many important properties of a Banach space are closely linked to
the shape of its closed unit sphere, that is, the set $S = \{x : \|x\| \le 1\}$.
One basic property of S is that it is always convex, in the sense (see
Problem 32-5) that if x and y are any two vectors in S, then the vector
$z = \alpha x + \beta y$ is also in S, where $\alpha$ and $\beta$ are non-negative real numbers
such that $\alpha + \beta = 1$; for $\|z\| = \|\alpha x + \beta y\| \le \alpha\|x\| + \beta\|y\| \le \alpha + \beta = 1$.
In this connection, it is illuminating to consider the shape of S for certain
simple examples. Let our underlying linear space be the real linear
space $R^2$ of all ordered pairs $x = (x_1, x_2)$ of real numbers. As we have
seen, there are many different norms which can be defined on $R^2$, among
which are the following: $\|x\|_1 = |x_1| + |x_2|$; $\|x\|_2 = (|x_1|^2 + |x_2|^2)^{1/2}$; and
$\|x\|_\infty = \max \{|x_1|, |x_2|\}$. Figure 35 illustrates the closed unit sphere
which corresponds to each of these norms. In the first case, S is the
square with vertices (1,0), (0,1), (−1,0), (0,−1); in the second, it is the
circular disc of radius 1; and in the third, it is the square with vertices
(1,1), (−1,1), (−1,−1), (1,−1). If we consider the norm defined by

    $\|x\|_p = (|x_1|^p + |x_2|^p)^{1/p}$,    (12)

where $1 \le p < \infty$, and if we allow p to increase from 1 to $\infty$, then the
corresponding S's swell continuously from the first square mentioned to
the second. We note that S is truly "spherical" $\Leftrightarrow$ p = 2. These
considerations also show quite clearly why we always assume that
$p \ge 1$; for if we were to define $\|x\|_p$ by formula (12) with p < 1, then
$S = \{x : \|x\|_p \le 1\}$ would not be convex (see the star-shaped inner portion
of Fig. 35). For p < 1, therefore, formula (12) does not yield a norm.

Fig. 35. Some closed unit spheres.
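The failure of convexity for p < 1 can be seen on a single pair of points. The sketch below is an illustration only; `p_form` and `midpoint` are hypothetical helper names, and the points (1,0) and (0,1) are chosen because they lie on the boundary of S for every p.

```python
def p_form(x, p):
    """Evaluate formula (12) for any real p > 0 (a norm only when p >= 1)."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)


def midpoint(x, y):
    """Return the convex combination x/2 + y/2."""
    return tuple((a + b) / 2 for a, b in zip(x, y))


x, y = (1.0, 0.0), (0.0, 1.0)
m = midpoint(x, y)
for p in (0.5, 1.0, 2.0):
    # x and y always have value 1; the midpoint leaves S exactly when p < 1.
    print(p, p_form(x, p), p_form(y, p), p_form(m, p))
```

For p = 0.5 the midpoint of two vectors of S has value greater than 1, so S is not convex and the triangle inequality fails.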
In the above examples, we have exhibited several different types of
Banach spaces, and there are yet others which we have not mentioned. 
Amid this diversity of possibilities, it is well to realize that any Banach 
space can be regarded—from the point of view of its linear and norm 
structures alone—as a closed linear subspace of C(X) for a suitable 
compact Hausdorff space X. We prove this below, in our discussion 
of the natural imbedding of a Banach space in its second conjugate space. 


Problems 


1. Let N be a non-zero normed linear space, and prove that N is a
   Banach space $\Leftrightarrow$ $\{x : \|x\| = 1\}$ is complete.
2. Let a Banach space B be the direct sum of the linear subspaces M and
   N, so that $B = M \oplus N$. If $z = x + y$ is the unique expression of a
   vector z in B as the sum of vectors x and y in M and N, then a new
   norm can be defined on the linear space B by $\|z\|' = \|x\| + \|y\|$.
   Prove that this actually is a norm. If B' symbolizes the linear space
   B equipped with this new norm, prove that B' is a Banach space if
   M and N are closed in B.

3. Prove Eq. (9) for the case of an arbitrary positive integer n. 
4. In this problem we sketch the proofs (and we ask the reader to fill
   in the details) of some important inequalities relating to n-tuples
   $x = (x_1, x_2, \ldots, x_n)$ and $y = (y_1, y_2, \ldots, y_n)$ of scalars. Whenever
   p occurs alone, and nothing is said to the contrary, we assume
   that $1 \le p < \infty$; and whenever p and q occur together, we assume
   that both are greater than 1 and that 1/p + 1/q = 1.
   (a) Show that a and $b \ge 0 \Rightarrow a^{1/p} b^{1/q} \le a/p + b/q$. (If a = 0 or
       b = 0, the conclusion is clear, so assume that both are positive.
       If $k \in (0,1)$, define f(t) for $t \ge 1$ by $f(t) = k(t - 1) - t^k + 1$.
       Note that f(1) = 0 and $f'(t) \ge 0$, and conclude that $t^k \le kt + (1 - k)$.
       If $a \ge b$, put t = a/b and k = 1/p; if a < b,
       put t = b/a and k = 1/q; and in each case, draw the required
       conclusion.)

   (b) Prove Hölder's inequality: $\sum_{i=1}^n |x_i y_i| \le \|x\|_p \|y\|_q$. (If x = 0 or
       y = 0, the inequality is obvious, so assume that both are $\ne 0$.
       Put $a_i = (|x_i|/\|x\|_p)^p$ and $b_i = (|y_i|/\|y\|_q)^q$, and use part (a)
       to obtain $|x_i y_i| / \|x\|_p \|y\|_q \le a_i/p + b_i/q$. Add these inequalities
       for i = 1, 2, ..., n, and conclude that

           $\left(\sum_{i=1}^n |x_i y_i|\right) / \|x\|_p \|y\|_q \le 1/p + 1/q = 1$.)


   (c) Prove Minkowski's inequality: $\|x + y\|_p \le \|x\|_p + \|y\|_p$. (The
       inequality is evident when p = 1, so assume that p > 1.
       Use Hölder's inequality to obtain

           $\|x + y\|_p^p = \sum_{i=1}^n |x_i + y_i|^p = \sum_{i=1}^n |x_i + y_i| \, |x_i + y_i|^{p-1}$
           $\le \sum_{i=1}^n |x_i| \, |x_i + y_i|^{p-1} + \sum_{i=1}^n |y_i| \, |x_i + y_i|^{p-1}$
           $\le (\|x\|_p + \|y\|_p) \|x + y\|_p^{p/q}$.)


       When p = q = 2, Hölder's inequality becomes Cauchy's
       inequality as stated and proved in Sec. 15. The Hölder and
       Minkowski inequalities can easily be extended from finite sums
       to series. For readers with some knowledge of the theory of
       measure and integration, we remark that these inequalities can
       also be stated in the following much more general forms: if f is
       in $L_p$ and g is in $L_q$, then their pointwise product fg is in $L_1$ and

           $\|fg\|_1 \le \|f\|_p \|g\|_q$;

       and if f and g are both in $L_p$, then f + g is also in $L_p$ and

           $\|f + g\|_p \le \|f\|_p + \|g\|_p$.

       It is to be understood that f and g are measurable functions
       defined on an arbitrary measure space and that the norms
       occurring in these inequalities are those defined by formula (7).
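The finite-sum inequalities of Problem 4 lend themselves to a quick numerical sanity check before a proof is attempted. The sketch below is an assumption-laden illustration (real scalars, randomly drawn vectors, hypothetical function names); it proves nothing, but a negative gap would signal an error in the statement.

```python
import random


def p_norm(x, p):
    """Return (sum |x_i|^p)^(1/p)."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)


def holder_gap(x, y, p):
    """Right side minus left side of Holder's inequality, with 1/p + 1/q = 1."""
    q = p / (p - 1.0)
    return p_norm(x, p) * p_norm(y, q) - sum(abs(a * b) for a, b in zip(x, y))


def minkowski_gap(x, y, p):
    """Right side minus left side of Minkowski's inequality."""
    s = [a + b for a, b in zip(x, y)]
    return p_norm(x, p) + p_norm(y, p) - p_norm(s, p)


rng = random.Random(0)  # fixed seed so the run is repeatable
worst_holder = worst_minkowski = float("inf")
for _ in range(1000):
    n = rng.randint(1, 6)
    p = rng.uniform(1.1, 6.0)
    x = [rng.uniform(-1, 1) for _ in range(n)]
    y = [rng.uniform(-1, 1) for _ in range(n)]
    worst_holder = min(worst_holder, holder_gap(x, y, p))
    worst_minkowski = min(worst_minkowski, minkowski_gap(x, y, p))
print(worst_holder, worst_minkowski)
```

Both gaps should stay non-negative up to rounding error; taking p = 2 and y = x gives equality in Hölder's inequality, which is the Cauchy case.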


47. CONTINUOUS LINEAR TRANSFORMATIONS 


Let N and N' be normed linear spaces with the same scalars, and
let T be a linear transformation of N into N'.¹ When we say that T is
continuous, we mean that it is continuous as a mapping of the metric
space N into the metric space N'. By Theorem 13-B, this amounts to the
condition that $x_n \to x$ in N $\Rightarrow$ $T(x_n) \to T(x)$ in N'. Our main purpose
in this section is to convert the requirement of continuity into several
more useful equivalent forms and to show that the set of all continuous
linear transformations of N into N' can itself be made into a normed linear
space in a natural way.


Theorem A. Let N and N' be normed linear spaces and T a linear
transformation of N into N'. Then the following conditions on T are all
equivalent to one another:

(1) T is continuous;

(2) T is continuous at the origin, in the sense that $x_n \to 0 \Rightarrow T(x_n) \to 0$;

(3) there exists a real number $K \ge 0$ with the property that $\|T(x)\| \le K\|x\|$
    for every $x \in N$;

(4) if $S = \{x : \|x\| \le 1\}$ is the closed unit sphere in N, then its
    image T(S) is a bounded set in N'.

¹ In the future, whenever we mention two normed linear spaces with a view to
considering linear transformations of one into the other, we shall always assume,
without necessarily saying so explicitly, that they have the same scalars.

PROOF. (1) $\Leftrightarrow$ (2). If T is continuous, then since T(0) = 0 it is
certainly continuous at the origin. On the other hand, if T is continuous at
the origin, then $x_n \to x \Leftrightarrow x_n - x \to 0 \Rightarrow T(x_n - x) \to 0 \Leftrightarrow T(x_n) - T(x) \to 0
\Leftrightarrow T(x_n) \to T(x)$, so T is continuous.

(2) $\Leftrightarrow$ (3). It is obvious that (3) $\Rightarrow$ (2), for if such a K exists, then
$x_n \to 0$ clearly implies that $T(x_n) \to 0$. To show that (2) $\Rightarrow$ (3), we
assume that there is no such K. It follows from this that for each
positive integer n we can find a vector $x_n$ such that $\|T(x_n)\| > n\|x_n\|$, or
equivalently, such that $\|T(x_n / n\|x_n\|)\| > 1$. If we now put

    $y_n = x_n / n\|x_n\|$,

then it is easy to see that $y_n \to 0$ but $T(y_n) \not\to 0$, so T is not continuous
at the origin.

(3) $\Leftrightarrow$ (4). Since a non-empty subset of a normed linear space is
bounded $\Leftrightarrow$ it is contained in a closed sphere centered on the origin, it is
evident that (3) $\Rightarrow$ (4); for if $\|x\| \le 1$, then $\|T(x)\| \le K$. To show that
(4) $\Rightarrow$ (3), we assume that T(S) is contained in a closed sphere of radius
K centered on the origin. If x = 0, then T(x) = 0, and clearly $\|T(x)\| \le K\|x\|$;
and if $x \ne 0$, then $x/\|x\| \in S$, and therefore $\|T(x/\|x\|)\| \le K$,
so again we have $\|T(x)\| \le K\|x\|$.
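The argument for (2) $\Rightarrow$ (3) can be made concrete with a linear transformation that has no bound. The sketch below is an assumed example, not one taken from the text: N is the space of real polynomials on [0,1] with the sup norm, T is differentiation, and $x_n(t) = t^n$. Polynomials are stored as coefficient lists, the sup norm is approximated on a grid (exact here because these monomials peak at t = 1), and all names are hypothetical.

```python
def sup_norm(coeffs, samples=1001):
    """Approximate sup |P(t)| over [0,1] on a uniform grid containing 0 and 1."""
    def value(t):
        return sum(c * t ** k for k, c in enumerate(coeffs))
    return max(abs(value(i / (samples - 1))) for i in range(samples))


def derivative(coeffs):
    """Coefficients of P' given the coefficients of P (constant term first)."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def monomial(n):
    """Coefficient list of the polynomial t^n."""
    return [0.0] * n + [1.0]


for n in (1, 5, 25):
    x = monomial(n)
    # The ratio ||T(x_n)|| / ||x_n|| grows without bound, so no K can work.
    print(n, sup_norm(derivative(x)) / sup_norm(x))
```

Since the ratio equals n, the vectors $y_n = x_n / n\|x_n\|$ of the proof tend to 0 while $\|T(y_n)\| = 1$, so this T is not continuous at the origin.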


If the linear transformation T in this theorem satisfies condition (3),
so that there exists a real number $K \ge 0$ with the property that

    $\|T(x)\| \le K\|x\|$

for every x, then K is called a bound for T, and such a T is often referred
to as a bounded linear transformation. According to our theorem, T is
bounded $\Leftrightarrow$ it is continuous, so these two adjectives can be used
interchangeably. We now assume that T is continuous, so that it satisfies
condition (4), and we define its norm by

    $\|T\| = \sup \{\|T(x)\| : \|x\| \le 1\}$.    (1)

When $N \ne \{0\}$, this formula can clearly be written in the equivalent form

    $\|T\| = \sup \{\|T(x)\| : \|x\| = 1\}$.    (2)

It is apparent from the proof of Theorem A that the set of all bounds for
T equals the set of all radii of closed spheres centered on the origin which
contain T(S). This yields yet another expression for the norm of T,
namely,

    $\|T\| = \inf \{K : K \ge 0$ and $\|T(x)\| \le K\|x\|$ for all $x\}$;    (3)

and from this we see at once that

    $\|T(x)\| \le \|T\| \, \|x\|$    (4)

for all x.
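Formula (1) can be evaluated exactly in a simple finite-dimensional case. The sketch below assumes real scalars and the max norm on both copies of $R^n$, and it relies on the standard fact that a convex function on the cube $\{x : \max |x_i| \le 1\}$ attains its supremum at a vertex, so only the sign vectors need to be tried. The matrix and the function names are illustrative assumptions.

```python
from itertools import product


def max_norm(x):
    """Return max |x_i|."""
    return max(abs(t) for t in x)


def apply(A, x):
    """Apply the matrix A (a list of rows) to the vector x."""
    return [sum(a * t for a, t in zip(row, x)) for row in A]


def norm_by_vertices(A):
    """sup ||Ax|| over the vertices (+-1, ..., +-1) of the closed unit sphere."""
    n = len(A[0])
    return max(max_norm(apply(A, s)) for s in product((-1.0, 1.0), repeat=n))


def norm_by_row_sums(A):
    """Largest absolute row sum, the closed form for this operator norm."""
    return max(sum(abs(a) for a in row) for row in A)


A = [[1.0, -2.0], [3.0, 0.5]]  # hypothetical operator on R^2
print(norm_by_vertices(A), norm_by_row_sums(A))
```

The two numbers agree, and inequality (4) can then be spot-checked by comparing `max_norm(apply(A, x))` with the norm times `max_norm(x)` for any vector x.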

We now denote the set of all continuous (or bounded) linear
transformations of N into N' by $\mathcal{B}(N,N')$, where the letter "$\mathcal{B}$" is intended to
suggest the adjective "bounded." It is a routine matter to verify that
this set is a linear space with respect to the pointwise linear operations
defined by Eqs. 44-(1) and 44-(2) and to show that formula (1) actually
does define a norm on this linear space. We summarize and extend these
remarks in


Theorem B. If N and N' are normed linear spaces, then the set $\mathcal{B}(N,N')$
of all continuous linear transformations of N into N' is itself a normed
linear space with respect to the pointwise linear operations and the norm
defined by (1). Further, if N' is a Banach space, then $\mathcal{B}(N,N')$ is also a
Banach space.

PROOF. We leave to the reader the simple task of showing that $\mathcal{B}(N,N')$
is a normed linear space, and we prove that this space is complete when
N' is.

Let $\{T_n\}$ be a Cauchy sequence in $\mathcal{B}(N,N')$. If x is an arbitrary
vector in N, then $\|T_m(x) - T_n(x)\| = \|(T_m - T_n)(x)\| \le \|T_m - T_n\| \, \|x\|$
shows that $\{T_n(x)\}$ is a Cauchy sequence in N'; and since N' is complete,
there exists a vector in N' (we denote it by T(x)) such that $T_n(x) \to T(x)$.
This defines a mapping T of N into N', and by the joint continuity
of addition and scalar multiplication, T is easily seen to be a linear
transformation. To conclude the proof, we have only to show that T is
continuous and that $T_n \to T$ with respect to the norm on $\mathcal{B}(N,N')$. By
the inequality 46-(1), the norms of the terms of a Cauchy sequence in a
normed linear space form a bounded set of numbers, so

    $\|T(x)\| = \|\lim T_n(x)\| = \lim \|T_n(x)\| \le \sup (\|T_n\| \, \|x\|) = (\sup \|T_n\|) \|x\|$

shows that T has a bound and is therefore continuous. It remains to be
proved that $\|T_n - T\| \to 0$. Let $\epsilon > 0$ be given, and let $n_0$ be a positive
integer such that $m, n \ge n_0 \Rightarrow \|T_m - T_n\| < \epsilon$. If $\|x\| \le 1$ and
$m, n \ge n_0$, then

    $\|T_m(x) - T_n(x)\| = \|(T_m - T_n)(x)\| \le \|T_m - T_n\| \, \|x\| \le \|T_m - T_n\| < \epsilon$.

We now hold m fixed and allow n to approach $\infty$, and we see that
$\|T_m(x) - T_n(x)\| \to \|T_m(x) - T(x)\|$, from which we conclude that
$\|T_m(x) - T(x)\| \le \epsilon$ for all $m \ge n_0$ and all x such that $\|x\| \le 1$. This
shows that $\|T_m - T\| \le \epsilon$ for all $m \ge n_0$, and the proof is complete.


Let N be a normed linear space. We call a continuous linear
transformation of N into itself an operator on N, and we denote the
normed linear space of all operators on N by $\mathcal{B}(N)$ instead of $\mathcal{B}(N,N)$.
Theorem B shows that $\mathcal{B}(N)$ is a Banach space when N is. Furthermore,
if operators are multiplied in accordance with formula 44-(3), then
$\mathcal{B}(N)$ is an algebra in which multiplication is related to the norm by

    $\|TT'\| \le \|T\| \, \|T'\|$.    (5)

This relation is proved by the following computation:

    $\|TT'\| = \sup \{\|(TT')(x)\| : \|x\| \le 1\} = \sup \{\|T(T'(x))\| : \|x\| \le 1\}$
    $\le \sup \{\|T\| \, \|T'(x)\| : \|x\| \le 1\} = \|T\| \sup \{\|T'(x)\| : \|x\| \le 1\} = \|T\| \, \|T'\|$.

We know from the previous section that addition and scalar
multiplication in $\mathcal{B}(N)$ are jointly continuous, as they are in any normed linear
space. Property (5) permits us to conclude that multiplication is also
jointly continuous:

    $T_n \to T$ and $T_n' \to T' \Rightarrow T_n T_n' \to TT'$.

This follows at once from

    $\|T_n T_n' - TT'\| = \|T_n(T_n' - T') + (T_n - T)T'\| \le \|T_n\| \, \|T_n' - T'\| + \|T_n - T\| \, \|T'\|$.

We also remark that when $N \ne \{0\}$, then the identity transformation I
is an identity for the algebra $\mathcal{B}(N)$. In this case, we clearly have

    $\|I\| = 1$;    (6)

for $\|I\| = \sup \{\|I(x)\| : \|x\| \le 1\} = \sup \{\|x\| : \|x\| \le 1\} = 1$.
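Relations (5) and (6) can be spot-checked on matrices. The sketch below assumes real scalars and the max norm on $R^n$, for which the operator norm of a matrix is known to be its largest absolute row sum; the random matrices, the seed, and the function names are illustrative assumptions.

```python
import random


def row_sum_norm(A):
    """Operator norm induced by the max norm: the largest absolute row sum."""
    return max(sum(abs(a) for a in row) for row in A)


def matmul(A, B):
    """Matrix product, i.e. composition of the two operators."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def identity(n):
    """The n x n identity matrix."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


rng = random.Random(1)  # fixed seed so the run is repeatable


def random_matrix(n):
    return [[rng.uniform(-2, 2) for _ in range(n)] for _ in range(n)]


pairs = [(random_matrix(3), random_matrix(3)) for _ in range(500)]
smallest_gap = min(row_sum_norm(A) * row_sum_norm(B) - row_sum_norm(matmul(A, B))
                   for A, B in pairs)
print(smallest_gap, row_sum_norm(identity(3)))
```

A non-negative smallest gap is consistent with (5), and the identity matrix has norm 1 as in (6).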

We complete this section with some definitions which will often be
useful in our later work. Let N and N' be normed linear spaces. An
isometric isomorphism of N into N' is a one-to-one linear transformation
T of N into N' such that $\|T(x)\| = \|x\|$ for every x in N; and N is said to
be isometrically isomorphic to N' if there exists an isometric isomorphism
of N onto N'. This terminology enables us to give precise meaning to
the statement that one normed linear space is essentially the same as
another.


Problems 


1. If M is a closed linear subspace of a normed linear space N, and if
   T is the natural mapping of N onto N/M defined by T(x) = x + M,
   show that T is a continuous linear transformation for which $\|T\| \le 1$.

2. If T is a continuous linear transformation of a normed linear space
   N into a normed linear space N', and if M is its null space, show that
   T induces a natural linear transformation T' of N/M into N' and
   that $\|T'\| = \|T\|$.

3. Let N and N' be normed linear spaces with the same scalars. If
   N is infinite-dimensional and $N' \ne \{0\}$, show that there exists a
   linear transformation of N into N' which is not continuous. (We
   shall see in Problem 7 that if N is finite-dimensional, then every
   linear transformation of N into N' is automatically continuous.)

4. Let a linear space L be made into a normed linear space in two ways,
   and let the two norms of a vector x be denoted by $\|x\|$ and $\|x\|'$.
   These norms are said to be equivalent if they generate the same
   topology on L. Show that this is the case $\Leftrightarrow$ there exist two positive real
   numbers $K_1$ and $K_2$ such that $K_1\|x\| \le \|x\|' \le K_2\|x\|$ for all x.
   (If L is finite-dimensional, then any two norms defined on it are
   equivalent. See Problem 7.)

5. If n is a fixed positive integer, the spaces $l_p^n$ ($1 \le p \le \infty$) consist
   of a single underlying linear space with different norms defined
   on it. Show that these norms are all equivalent to one another.
   (Hint: show that convergence with respect to each norm amounts to
   coordinatewise convergence.)

6. If N is an arbitrary normed linear space, show that any linear
   transformation T of $l_p^n$ ($1 \le p \le \infty$) into N is continuous. (Hint: if
   $\{e_1, e_2, \ldots, e_n\}$ is the natural basis for $l_p^n$, where $e_i$ is the n-tuple
   with 1 in the ith place and 0's elsewhere, then an arbitrary vector x
   in $l_p^n$ can be written uniquely in the form

       $x = x_1 e_1 + x_2 e_2 + \cdots + x_n e_n$,

   and from this we get $T(x) = x_1 T(e_1) + x_2 T(e_2) + \cdots + x_n T(e_n)$;
   now apply the hint given for Problem 5.)

7. Let N be a finite-dimensional normed linear space with dimension
   n > 0, and let $\{e_1, e_2, \ldots, e_n\}$ be a basis for N. Each vector x
   in N can be written uniquely in the form

       $x = \alpha_1 e_1 + \alpha_2 e_2 + \cdots + \alpha_n e_n$.

   If T is the one-to-one linear transformation of N onto $l_1^n$ defined by
   $T(x) = (\alpha_1, \alpha_2, \ldots, \alpha_n)$, then $T^{-1}$ is continuous by Problem 6.

   (a) Prove that T is continuous. (Hint: if T is not continuous,
       then for some $\epsilon > 0$ there exists a sequence $\{y_n\}$ in N such that
       $y_n \to 0$ and $\|T(y_n)\| \ge \epsilon$; if $z_n = y_n / \|T(y_n)\|$, then $z_n \to 0$ and
       $\|T(z_n)\| = 1$; the subset of $l_1^n$ consisting of all vectors of norm 1
       is compact, so $\{T(z_n)\}$ has a subsequence which converges to a
       vector with norm 1; now use the continuity of $T^{-1}$.)

   (b) Show that every linear transformation of N into an arbitrary
       normed linear space N' is continuous.

   (c) Show that any other norm defined on N is equivalent to the
       given norm.

   (d) Show that N is complete, and infer from this that every
       finite-dimensional linear subspace of an arbitrary normed linear space
       is closed.

8. It is a simple consequence of Problem 7 that every finite-dimensional
   normed linear space is locally compact. Prove the converse, that is,
   that a locally compact normed linear space N is finite-dimensional.
   Hint: the closed unit sphere S of N is compact, so there is a finite
   subset of S, say $\{x_1, x_2, \ldots, x_n\}$, with the property that each point
   of S is distant by less than 1/3 from one of the $x_i$'s; let M be the
   linear subspace of N spanned by the $x_i$'s; and show that M = N
   (to do this, assume that there exists a vector y not in M, use the fact
   that M is closed to infer that d = d(y,M) > 0, find $m_0$ in M such
   that $d \le \|y - m_0\| \le 3d/2$, and deduce the contradiction that the
   vector $y_0$ in S defined by $y_0 = (y - m_0)/\|y - m_0\|$ is distant from M
   by at least 2/3).


48. THE HAHN-BANACH THEOREM 


One of the basic principles of strategy in the study of an abstract 
mathematical system can be stated as follows: consider the set of all 
structure-preserving mappings of that system into the simplest system 
of the same type. This principle is richly fruitful in the structure theory 
(or representation theory) of groups, rings, and algebras, and we shall see 
in the next section how it works for normed linear spaces. 

We have remarked that the spaces R and C are the simplest of all
normed linear spaces. If N is an arbitrary normed linear space, the
above principle leads us to form the set of all continuous linear
transformations of N into R or C, according as N is real or complex. This set,
which is $\mathcal{B}(N,R)$ or $\mathcal{B}(N,C)$, is denoted by N* and is called the conjugate
space of N. The elements of N* are called continuous linear functionals,
or more briefly, functionals.¹ It follows from our work in the previous
section that if these functionals are added and multiplied by scalars
pointwise, and if the norm of a functional f is defined by

    $\|f\| = \sup \{|f(x)| : \|x\| \le 1\}$
    $= \inf \{K : K \ge 0$ and $|f(x)| \le K\|x\|$ for all $x\}$,

then N* is a Banach space.

¹ The noun "functional" seems to have originated in the theory of integral
equations. It was used to distinguish between a function in the elementary sense
defined on a set of numbers and a function (or functional) defined on a set of functions.
In this book, we always use the word to mean a scalar-valued continuous linear
function defined on a normed linear space.

When we consider various specific Banach spaces, the problem arises 
of determining the concrete nature of the functionals associated with 
these spaces. It is not our aim in this section to explore the ample body 
of theory which centers around this problem, and in any case, the machin- 
ery necessary for such an enterprise (mostly the theory of measure and 
integration) is not available to us. Nevertheless, for the reader who may 
have the required background, we mention some of the main facts with- 
out proof. 

Let X be a measure space with measure m, and let p be a given real
number such that $1 < p < \infty$. Consider the Banach space $L_p$ of all
measurable functions f defined on X for which $|f(x)|^p$ is integrable. If
g is a function in $L_q$, where 1/p + 1/q = 1, we define a function $F_g$ on
$L_p$ by

    $F_g(f) = \int f(x) g(x) \, dm(x)$.

The Hölder inequality for integrals mentioned at the end of Problem
46-4 shows that

    $|F_g(f)| = \left| \int f(x) g(x) \, dm(x) \right| \le \int |f(x) g(x)| \, dm(x) \le \|f\|_p \|g\|_q$.

We conclude from this that $F_g$ is a well-defined scalar-valued linear
function on $L_p$ with the property that $\|F_g\| \le \|g\|_q$, and is therefore a
functional on $L_p$. It can be shown that equality holds here, so that

    $\|F_g\| = \|g\|_q$.

It can also be shown that every functional on $L_p$ arises in this way, so the
mapping $g \to F_g$ (which is clearly linear) is an isometric isomorphism of
$L_q$ onto $L_p^*$. This statement is usually expressed by writing

    $L_p^* = L_q$,    (1)

where the equality sign is to be interpreted in the sense just explained.
If we specialize these considerations to n-tuples of scalars, we see
that (1) becomes

    $(l_p^n)^* = l_q^n$.    (2)

Further, it can be shown that

    $(l_1^n)^* = l_\infty^n$    (3)

and that

    $(l_\infty^n)^* = l_1^n$.    (4)

We sketch proofs of (2), (3), and (4) in the problems. When we consider
sequences of scalars, then for $1 < p < \infty$ we have the following special
case of (1):

    $l_p^* = l_q$.    (5)

If p = 1, we obtain a natural extension of (3):

    $l_1^* = l_\infty$.    (6)

The corresponding extension of (4) is another matter, for it is false that
$l_\infty^* = l_1$. Instead, we have

    $c_0^* = l_1$.    (7)


What is $l_\infty^*$? We saw in Sec. 46 that $l_\infty$ is a special case of $\mathcal{C}(X)$, so this
question leads naturally to the problem of determining the nature of the
conjugate space $\mathcal{C}^*(X)$. The classic solution of this problem for a space
X which is compact Hausdorff (or even normal) is known as the Riesz
representation theorem, and it depends on some of the deeper parts of the
theory of measure and integration (see Dunford and Schwartz [8, pp.
261-265]). The situation is somewhat simpler for the case in which X is
an interval [a,b] on the real line, but even here an adequate treatment
requires a knowledge of Stieltjes integrals (see Riesz and Sz.-Nagy
[35, secs. 49-51]).

Most of the theory of conjugate spaces rests on the Hahn-Banach 
theorem, which asserts that any functional defined on a linear subspace of 
a normed linear space can be extended linearly and continuously to the 
whole space without increasing its norm. The proof is rather compli- 
cated, so we begin with a lemma which serves to isolate its most difficult 
parts. 


Lemma. Let M be a linear subspace of a normed linear space N, and let
f be a functional defined on M. If $x_0$ is a vector not in M, and if

    $M_0 = M + [x_0]$

is the linear subspace spanned by M and $x_0$, then f can be extended to a
functional $f_0$ defined on $M_0$ such that $\|f_0\| = \|f\|$.

PROOF. We first prove the lemma under the assumption that N is a
real normed linear space. We may assume, without loss of generality,
that $\|f\| = 1$. Since $x_0$ is not in M, each vector y in $M_0$ is uniquely
expressible in the form $y = x + \alpha x_0$ with x in M. It is clear that the
definition $f_0(x + \alpha x_0) = f_0(x) + \alpha f_0(x_0) = f(x) + \alpha r_0$ extends f linearly
to $M_0$ for every choice of the real number $r_0 = f_0(x_0)$. Since we are
trying to arrange matters so that $\|f_0\| = 1$, our problem is to show that
$r_0$ can be chosen in such a way that $|f_0(x + \alpha x_0)| \le \|x + \alpha x_0\|$ for
every x in M and every $\alpha \ne 0$. Since $f_0(x + \alpha x_0) = f(x) + \alpha r_0$, this
inequality can be written as

    $-\|x + \alpha x_0\| \le f(x) + \alpha r_0 \le \|x + \alpha x_0\|$

or

    $-f(x) - \|x + \alpha x_0\| \le \alpha r_0 \le -f(x) + \|x + \alpha x_0\|$,

which in turn is equivalent to

    $-f(x/\alpha) - \|x/\alpha + x_0\| \le r_0 \le -f(x/\alpha) + \|x/\alpha + x_0\|$.    (8)

(When $\alpha < 0$, division by $\alpha$ reverses both inequalities, and since
$\|x + \alpha x_0\| / \alpha = -\|x/\alpha + x_0\|$ in that case, the result again has the form (8).)
We now observe that for any two vectors $x_1$ and $x_2$ in M we have

    $f(x_2) - f(x_1) = f(x_2 - x_1) \le |f(x_2 - x_1)| \le \|f\| \, \|x_2 - x_1\|$
    $= \|x_2 - x_1\| = \|(x_2 + x_0) - (x_1 + x_0)\| \le \|x_2 + x_0\| + \|x_1 + x_0\|$,

so

    $-f(x_1) - \|x_1 + x_0\| \le -f(x_2) + \|x_2 + x_0\|$.    (9)

If we define two real numbers a and b by

    $a = \sup \{-f(x) - \|x + x_0\| : x \in M\}$
and
    $b = \inf \{-f(x) + \|x + x_0\| : x \in M\}$,

then (9) shows that $a \le b$. If we now choose $r_0$ to be any real number
such that $a \le r_0 \le b$, then the required inequality (8) is satisfied and this
part of the proof is complete.
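The numbers a and b of the proof can be computed in a small example, which also shows that the extension may or may not be unique. The sketch below makes several assumptions that are not in the text: N is $R^2$ with either the $l_1$ norm or the max norm, M is the first coordinate axis, f(t,0) = t (so that $\|f\| = 1$), $x_0 = (0,1)$, and the sup and inf are taken over a finite grid, which happens to attain them here. All names are hypothetical.

```python
def l1(v):
    """The l_1 norm on R^2."""
    return abs(v[0]) + abs(v[1])


def lmax(v):
    """The max norm on R^2."""
    return max(abs(v[0]), abs(v[1]))


def extension_interval(norm, grid):
    """Return (a, b) for M = {(t, 0)}, f(t, 0) = t and x0 = (0, 1).

    a = sup {-f(x) - ||x + x0||}, b = inf {-f(x) + ||x + x0||}, taken over
    the sample points x = (t, 0) with t in grid.
    """
    a = max(-t - norm((t, 1.0)) for t in grid)
    b = min(-t + norm((t, 1.0)) for t in grid)
    return a, b


grid = [k / 2.0 for k in range(-20, 21)]  # t = -10, -9.5, ..., 10
print(extension_interval(l1, grid))    # admissible values of r0 for the l_1 norm
print(extension_interval(lmax, grid))  # admissible values of r0 for the max norm
```

With the $l_1$ norm every $r_0$ in [−1, 1] yields a norm-preserving extension, while with the max norm the interval collapses to the single point $r_0 = 0$.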

We next use the result of the above paragraph to prove the lemma
for the case in which N is complex. Here f is a complex-valued
functional defined on M for which $\|f\| = 1$. We begin by remarking that a
complex linear space can be regarded as a real linear space by simply
restricting the scalars to be real numbers. If g and h are the real and
imaginary parts of f, so that f(x) = g(x) + ih(x) for every x in M, then
both g and h are easily seen to be real-valued functionals on the real
space M; and since $\|f\| = 1$, we have $\|g\| \le 1$. The equation

    $f(ix) = if(x)$,

together with $f(ix) = g(ix) + ih(ix)$ and

    $if(x) = i(g(x) + ih(x)) = ig(x) - h(x)$,

shows that $h(x) = -g(ix)$, so we can write $f(x) = g(x) - ig(ix)$. By
the above paragraph, we can extend g to a real-valued functional $g_0$ on
the real space $M_0$ in such a way that $\|g_0\| = \|g\|$, and we define $f_0$ for
x in $M_0$ by $f_0(x) = g_0(x) - ig_0(ix)$. It is easy to see that $f_0$ is an extension
of f from M to $M_0$, that $f_0(x + y) = f_0(x) + f_0(y)$, and that $f_0(\alpha x) = \alpha f_0(x)$
for all real $\alpha$'s. The fact that the property last stated is also valid for
all complex $\alpha$'s is a direct consequence of

    $f_0(ix) = g_0(ix) - ig_0(i^2 x) = g_0(ix) + ig_0(x) = i(g_0(x) - ig_0(ix)) = if_0(x)$,

so $f_0$ is linear as a complex-valued function defined on the complex space
$M_0$. All that remains to be proved is that $\|f_0\| = 1$, and we dispose of
this by showing that if x is a vector in $M_0$ for which $\|x\| = 1$, then
$|f_0(x)| \le 1$. If $f_0(x)$ is real, this follows from $f_0(x) = g_0(x)$ and $\|g_0\| \le 1$.
If $f_0(x)$ is complex, then we can write $f_0(x) = re^{i\theta}$ with r > 0, so

    $|f_0(x)| = r = e^{-i\theta} f_0(x) = f_0(e^{-i\theta} x)$;

and our conclusion now follows from $\|e^{-i\theta} x\| = \|x\| = 1$ and the fact
that $f_0(e^{-i\theta} x)$ is real.
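The identity $f(x) = g(x) - ig(ix)$, which lets a complex functional be recovered from its real part, is easy to test on a concrete case. The sketch below assumes a hypothetical complex-linear functional on $C^2$ given by two arbitrarily chosen coefficients; the names `f`, `g`, and `rebuilt` are introduced only for this illustration.

```python
# Hypothetical coefficients defining f(x) = c[0]*x[0] + c[1]*x[1] on C^2.
c = (2 - 1j, 0.5 + 3j)


def f(x):
    """A complex-linear functional on C^2."""
    return c[0] * x[0] + c[1] * x[1]


def g(x):
    """The real part of f, a real-linear functional."""
    return f(x).real


def rebuilt(x):
    """Recover f from g alone by the formula f(x) = g(x) - i g(ix)."""
    ix = tuple(1j * t for t in x)
    return g(x) - 1j * g(ix)


x = (1 + 2j, -3 + 0.5j)
print(f(x), rebuilt(x))
```

The two printed values agree up to rounding, as the computation $h(x) = -g(ix)$ in the proof predicts.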


Theorem A (the Hahn-Banach Theorem). Let M be a linear subspace of
a normed linear space N, and let f be a functional defined on M. Then f
can be extended to a functional $f_0$ defined on the whole space N such that
$\|f_0\| = \|f\|$.

PROOF. The set of all extensions of f to functionals g with the same
norm defined on subspaces which contain M is clearly a partially ordered
set with respect to the following relation: $g_1 \le g_2$ means that the domain
of $g_1$ is contained in the domain of $g_2$, and $g_2(x) = g_1(x)$ for all x in the
domain of $g_1$. It is easy to see that the union of any chain of extensions
is also an extension and is therefore an upper bound for the chain.
Zorn's lemma now implies that there exists a maximal extension $f_0$. We
complete the proof by observing that the domain of $f_0$ must be the entire
space N, for otherwise it could be extended further by our lemma and
would not be maximal.


As we stated in the introduction to this chapter, the main force of 
the Hahn-Banach theorem lies in the guarantee it provides that any 
Banach space (or normed linear space) has a rich supply of functionals. 
This property is to be understood in the sense of the following two 
theorems, on which most of its applications depend. 


Theorem B. If N is a normed linear space and $x_0$ is a non-zero vector in N,
then there exists a functional $f_0$ in N* such that $f_0(x_0) = \|x_0\|$ and $\|f_0\| = 1$.
PROOF. Let $M = \{\alpha x_0\}$ be the linear subspace of N spanned by $x_0$,
and define f on M by $f(\alpha x_0) = \alpha\|x_0\|$. It is clear that f is a functional on
M such that $f(x_0) = \|x_0\|$ and $\|f\| = 1$. By the Hahn-Banach theorem,
f can be extended to a functional $f_0$ in N* with the required properties.

Among other things, this result shows that N* separates the vectors
in N, for if x and y are any two distinct vectors, so that $x - y \ne 0$, then
there exists a functional f in N* such that $f(x - y) \ne 0$, or equivalently,
$f(x) \ne f(y)$.
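In the spaces $l_p^n$ with $1 < p < \infty$ the functional promised by Theorem B can be written down explicitly, without appeal to Zorn's lemma. The sketch below assumes real scalars and the identification $(l_p^n)^* = l_q^n$ of Eq. (2), so that a functional is represented by its coefficient vector y and its norm is $\|y\|_q$; the name `norming_functional` and the sample vector are hypothetical.

```python
import math


def p_norm(x, p):
    """Return (sum |x_i|^p)^(1/p)."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)


def norming_functional(x0, p):
    """Coefficients y_i = sign(x_i) |x_i|^(p-1) / ||x0||_p^(p-1) for x0 != 0."""
    n = p_norm(x0, p)
    return [math.copysign(abs(t) ** (p - 1), t) / n ** (p - 1) for t in x0]


x0 = (3.0, -4.0, 0.0)
p = 3.0
q = p / (p - 1.0)
y = norming_functional(x0, p)
value = sum(a * b for a, b in zip(x0, y))  # this is f0(x0)
print(value, p_norm(x0, p), p_norm(y, q))
```

The first two printed numbers agree, and the third is 1, matching $f_0(x_0) = \|x_0\|$ and $\|f_0\| = 1$.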


Theorem C. If M is a closed linear subspace of a normed linear space N
and $x_0$ is a vector not in M, then there exists a functional $f_0$ in N* such that
$f_0(M) = 0$ and $f_0(x_0) \ne 0$.

PROOF. The natural mapping T of N onto N/M (see Problem 47-1) is a
continuous linear transformation such that T(M) = 0 and

    $T(x_0) = x_0 + M \ne 0$.

By Theorem B, there exists a functional f in (N/M)* such that

    $f(x_0 + M) \ne 0$.

If we now define $f_0$ by $f_0(x) = f(T(x))$, then $f_0$ is easily seen to have the
desired properties.


These theorems play a critical role in the ideas developed in the 
following sections, and their significance will emerge quite clearly in the 
proper context. 


Problems 


1. Let M be a closed linear subspace of a normed linear space N, and
   let $x_0$ be a vector not in M. If d is the distance from $x_0$ to M, show
   that there exists a functional $f_0$ in N* such that $f_0(M) = 0$, $f_0(x_0) = 1$,
   and $\|f_0\| = 1/d$.

2. Prove that a normed linear space N is separable if its conjugate space
   N* is. (Hint: let $\{f_n\}$ be a countable dense set in N* and $\{x_n\}$ a
   corresponding set in N such that $\|x_n\| \le 1$ and $|f_n(x_n)| \ge \|f_n\|/2$;
   let M be the set of all linear combinations of the $x_n$'s whose
   coefficients are rational or, if N is complex, have rational real and
   imaginary parts; and use Theorem C to show that $\overline{M} = N$.) We
   remark that N* need not be separable when N is, for $l_1$ is easily
   proved to be separable, $l_1^* = l_\infty$, and $l_\infty$ is not separable (see Problem
   18-4).

3. In this problem we ask the reader to convince himself of the validity
   of Eqs. (2) to (4). Let L be the linear space of all n-tuples

       $x = (x_1, x_2, \ldots, x_n)$

   of scalars. If $\{e_1, e_2, \ldots, e_n\}$ is the natural basis described in
   Problem 47-6, then $x = x_1 e_1 + x_2 e_2 + \cdots + x_n e_n$; and if f is
   any scalar-valued linear function defined on L, then the equation
   $f(x) = x_1 f(e_1) + x_2 f(e_2) + \cdots + x_n f(e_n)$ shows that f determines,
   and is determined by, the n scalars $y_i = f(e_i)$. The mapping

       $y = (y_1, y_2, \ldots, y_n) \to f$,

   where $f(x) = \sum_{i=1}^n x_i y_i$, is clearly an isomorphism of L onto the
   linear space L' of all f's. When the space L of all x's is made into
   $l_p^n$ ($1 \le p \le \infty$) by suitably defining its norm, then by Problem 47-6
   the space L' of all f's equals its conjugate space $(l_p^n)^*$, where the norm
   of f is understood to be given by

       $\|f\| = \inf \{K : K \ge 0$ and $|f(x)| \le K\|x\|$ for all $x\}$.

   All that remains is to see what norm for the y's makes the mapping
   $y \to f$ an isometric isomorphism.

   (a) If $1 < p < \infty$, then $(l_p^n)^* = l_q^n$. The norm in this case is
       defined by $\|x\| = (\sum_{i=1}^n |x_i|^p)^{1/p}$, and it follows from

           $|f(x)| = \left| \sum_{i=1}^n x_i y_i \right| \le \sum_{i=1}^n |x_i y_i| \le \left( \sum_{i=1}^n |x_i|^p \right)^{1/p} \left( \sum_{i=1}^n |y_i|^q \right)^{1/q}$

       that $\|f\| \le (\sum_{i=1}^n |y_i|^q)^{1/q}$. Show that $\|f\| = (\sum_{i=1}^n |y_i|^q)^{1/q}$ by
       considering the vector x defined by $x_i = 0$ if $y_i = 0$ and

           $x_i = |y_i|^q / y_i$

       otherwise.

   (b) $(l_1^n)^* = l_\infty^n$. Here we have $\|x\| = \sum_{i=1}^n |x_i|$, and it follows from
       $|f(x)| = |\sum_{i=1}^n x_i y_i| \le \sum_{i=1}^n |x_i| \, |y_i| \le \max \{|y_i|\} (\sum_{i=1}^n |x_i|)$ that
       $\|f\| \le \max \{|y_i|\}$. Show that $\|f\| = \max \{|y_i|\}$ by considering
       the vector x defined by $x_i = |y_i| / y_i$ if $|y_i| = \max \{|y_i|\}$ and
       $x_i = 0$ otherwise.

   (c) $(l_\infty^n)^* = l_1^n$. In this case, the norm is defined by

           $\|x\| = \max \{|x_i|\}$,

       and it follows from

           $|f(x)| = \left| \sum_{i=1}^n x_i y_i \right| \le \sum_{i=1}^n |x_i| \, |y_i| \le \max \{|x_i|\} \left( \sum_{i=1}^n |y_i| \right)$

       that $\|f\| \le \sum_{i=1}^n |y_i|$. Show that $\|f\| = \sum_{i=1}^n |y_i|$ by considering
       the vector x defined by $x_i = 0$ if $y_i = 0$ and $x_i = |y_i| / y_i$
       otherwise.

4. The following generalized form of part of the Hahn-Banach theorem
   is useful in certain problems of measure theory. Prove it by suitably
   modifying the arguments given in the text. If p is a real function
   defined on a real linear space L such that $p(\alpha x) = \alpha p(x)$ for $\alpha \ge 0$
   and $p(x + y) \le p(x) + p(y)$, and if f is a real linear function defined
   on a linear subspace M such that $f(x) \le p(x)$ for all x in M, then
   f can be extended to a real linear function $f_0$ defined on L such that
   $f_0(x) \le p(x)$ for all x in L.
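The three hint vectors of Problem 3 can be tried out numerically before the equalities are proved. The sketch below assumes real scalars and a sample coefficient vector y chosen only for illustration; the helper names are hypothetical. In each case the ratio $|f(x)| / \|x\|$ for the hint vector is compared with the claimed value of $\|f\|$.

```python
def pairing(x, y):
    """f(x) = sum x_i y_i for the functional f with coefficients y."""
    return sum(a * b for a, b in zip(x, y))


def p_norm(x, p):
    return sum(abs(t) ** p for t in x) ** (1.0 / p)


def hint_a(y, q):
    """x_i = |y_i|^q / y_i, or 0 where y_i = 0 (part (a))."""
    return [0.0 if t == 0 else abs(t) ** q / t for t in y]


def hint_b(y):
    """x_i = |y_i| / y_i where |y_i| is largest, 0 elsewhere (part (b))."""
    m = max(abs(t) for t in y)
    return [abs(t) / t if abs(t) == m and t != 0 else 0.0 for t in y]


def hint_c(y):
    """x_i = |y_i| / y_i, or 0 where y_i = 0 (part (c))."""
    return [0.0 if t == 0 else abs(t) / t for t in y]


y = (3.0, -4.0, 0.0, 1.0)
p = 3.0
q = p / (p - 1.0)

xa, xb, xc = hint_a(y, q), hint_b(y), hint_c(y)
ratio_a = abs(pairing(xa, y)) / p_norm(xa, p)            # compare with ||y||_q
ratio_b = abs(pairing(xb, y)) / sum(abs(t) for t in xb)  # compare with max |y_i|
ratio_c = abs(pairing(xc, y)) / max(abs(t) for t in xc)  # compare with sum |y_i|
print(ratio_a, p_norm(y, q))
print(ratio_b, max(abs(t) for t in y))
print(ratio_c, sum(abs(t) for t in y))
```

Each pair of printed numbers agrees, which shows that the upper bounds derived in parts (a), (b), and (c) are attained for this y.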


49. THE NATURAL IMBEDDING OF N IN N** 


Since the conjugate space N* of a normed linear space N is itself a 
normed linear space, it is possible to form the conjugate space (N*)* of 
N*. We denote this space by N**, and we call it the second conjugate 
space of N. 

The importance of N** rests on the fact that each vector x in N
gives rise to a functional $F_x$ in N**. If we denote a typical element of
N* by f, then $F_x$ is defined by

    $F_x(f) = f(x)$.


In other words, we invert the usual practice by regarding the symbol
f(x) as specifying a function of f for each fixed x, and we emphasize this
point of view by writing f(x) in the form $F_x(f)$. A simple manipulation
of the definition shows that $F_x$ is linear:

    $F_x(\alpha f + \beta g) = (\alpha f + \beta g)(x)$
    $= \alpha f(x) + \beta g(x)$
    $= \alpha F_x(f) + \beta F_x(g)$.


If we now compute the norm of F., we see that 


Fell = sup {|F.(A)|:IIFll < 13 
sup {|f(z)|:Ilfll < 1} 
Val ELF] Mach =IFH <1) 
ZI. 


IAIA Il 


Theorem 48-B is exactly what is needed to guarantee that equality 
holds here, so for each x in N we have 


$$\|F_x\| = \|x\|.$$


It follows from these observations that $x \to F_x$ is a norm-preserving
mapping of N into N**. $F_x$ is called the functional on N* induced by
the vector x, and we refer to functionals of this kind as induced functionals.
We next point out that the mapping $x \to F_x$ is linear and is therefore an
isometric isomorphism of N into N**. To verify this, we must show that
$F_{x+y}(f) = (F_x + F_y)(f)$ and $F_{\alpha x}(f) = (\alpha F_x)(f)$ for every f in N*. The
first of these relations follows from


$$F_{x+y}(f) = f(x + y) = f(x) + f(y) = F_x(f) + F_y(f) = (F_x + F_y)(f),$$


and the second is proved similarly. The isometric isomorphism $x \to F_x$
is called the natural imbedding of N in N**, for it allows us to regard N as
part of N** without altering any of its structure as a normed linear space.
We write

$$N \subseteq N^{**},$$


where this set inclusion is to be understood in the sense just explained. 
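
The definition $F_x(f) = f(x)$ and the linearity of $F_x$ can be sketched in code for a toy case. The Python fragment below takes N to be the real plane and models functionals as callables; this modelling, and every name in it, is an assumption made for illustration and is not part of the text.

```python
# Sketch of the induced functional F_x on a toy space N = R^2.
# Functionals on N are modelled as Python callables; this modelling and all
# names below are assumptions made for illustration.

def functional(a, b):
    """The linear functional f(x) = a*x[0] + b*x[1] on R^2."""
    return lambda x: a * x[0] + b * x[1]

def induced(x):
    """F_x: takes a functional f and returns the number f(x)."""
    return lambda f: f(x)

def add(f, g):
    """Pointwise sum of two functionals."""
    return lambda x: f(x) + g(x)

def scale(alpha, f):
    """Scalar multiple of a functional."""
    return lambda x: alpha * f(x)

x = (2.0, -1.0)
f = functional(1.0, 3.0)
g = functional(-4.0, 0.5)
F_x = induced(x)

# Linearity of F_x in its argument: F_x(2f - 3g) = 2 F_x(f) - 3 F_x(g).
lhs = F_x(add(scale(2.0, f), scale(-3.0, g)))
rhs = 2.0 * F_x(f) - 3.0 * F_x(g)
print(lhs, rhs)
```

The point of the sketch is only that x is held fixed while f varies, which is the inversion of viewpoint described above.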
A normed linear space N is said to be reflexive if N = N**. The
spaces $l_p$ (and $L_p$) for $1 < p < \infty$ are reflexive, for $l_p^* = l_q$ and


$$l_p^{**} = l_q^* = l_p.$$


It follows from Problem 48-3 that the spaces $l_p^n$ for $1 \le p \le \infty$ are also
reflexive. Since N** is complete, N is necessarily complete if it is
reflexive. If N is complete, however, it is not necessarily reflexive, as we
see from $c_0^* = l_1$ and $c_0^{**} = l_1^* = l_\infty$. If X is a compact Hausdorff
space, it can be shown that $\mathcal{C}(X)$ is reflexive $\Leftrightarrow$ X is a finite set.

There is an interesting criterion for reflexivity, which depends on the 
concept of the weak topology on a normed linear space N. This is 
defined to be the weak topology on N generated by the functions in N* 
in the sense of Sec. 19; that is, it is the weakest topology on N with 
respect to which all the functions in N* remain continuous. The criterion
referred to is the following: if B is a Banach space, and if $S = \{x : \|x\| \le 1\}$
is its closed unit sphere, then B is reflexive $\Leftrightarrow$ S is compact in the weak
topology. This fact is something one should know about Banach spaces,
but we shall have no need for it ourselves, so we state it without proof.¹

¹ See Hille and Phillips [20, p. 38] or Dunford and Schwartz [8, p. 425].

Far more important for our purposes is the weak* topology on N*,
which is defined to be the weak topology on N* generated by all the
induced functionals $F_x$ in N**. This situation is rather complicated, so
we shall try to make clear just what is going on.

First of all, N* (like N) is a normed linear space, and it therefore
has a topology derived from its character as a metric space. This is
called the strong topology. N** is the set of all scalar-valued linear
functions defined on N* which are continuous with respect to its strong
topology. The weak topology on N* (like the weak topology on N) is the
weakest topology on N* with respect to which all the functions in N** are
continuous, and clearly this is weaker than its strong topology. So far, as
we have indicated, these concepts apply equally to N and N*. However,
since N* is the conjugate space of N, the natural imbedding enables us to
consider N as part of N**. We now form the weakest topology on N*
with respect to which all the functions in N (regarded as a subset of
N**) remain continuous. This is the weak* topology, and it is evidently
weaker than the weak topology. The weak* topology can be given a
more explicit description, in which its defining subbasic open sets are
displayed. Consider a vector x in N and its induced functional $F_x$ in N**.
The weak* topology on N* is the weakest topology under which all such
$F_x$'s are continuous. If $f_0$ is an arbitrary element in N*, and if $\epsilon > 0$ is
given, then the set


$$S(x, f_0, \epsilon) = \{f : f \in N^* \text{ and } |F_x(f) - F_x(f_0)| < \epsilon\}$$
$$= \{f : f \in N^* \text{ and } |f(x) - f_0(x)| < \epsilon\}$$


is an open set (in fact, a neighborhood of $f_0$) in the weak* topology.
Furthermore, the class of all sets of this kind, for all x's, $f_0$'s, and $\epsilon$'s, is
the defining open subbase for the weak* topology. All finite intersections
of these sets constitute an open base for this topology, and the open sets
themselves are all unions of these finite intersections.

We remark at this point that N* is a Hausdorff space with respect 
to its weak* topology. This follows at once from the fact that if f and g 
are distinct functionals in N*, then there must exist a vector x in N such
that $f(x) \ne g(x)$; for if we put $\epsilon = |f(x) - g(x)|/3$, then $S(x, f, \epsilon)$ and
$S(x, g, \epsilon)$ are disjoint neighborhoods of f and g in the weak* topology.

Let us now consider the closed unit sphere S* in N*, that is, the set 
$S^* = \{f : f \in N^* \text{ and } \|f\| \le 1\}$.¹ It is an easy consequence of Problem 2
that S* is compact in the strong topology $\Leftrightarrow$ N is finite-dimensional, so
the strong compactness of S* is a very stringent condition. If N is
complete, it follows from Problem 3 and our unproved criterion for
reflexivity that S* is compact in the weak topology $\Leftrightarrow$ N is reflexive, so
the weak compactness of S* is still a fairly substantial restriction. We 
state these facts to emphasize that the situation is quite different with 
the weak* topology, for here S* is always compact. 


¹ When we use the adjective "closed" in referring to S*, we intend only to
emphasize the inequality $\|f\| \le 1$, as contrasted with $\|f\| < 1$.


Theorem A. If N is a normed linear space, then the closed unit sphere S*
in N* is a compact Hausdorff space in the weak* topology.

PROOF. We already know that S* is a Hausdorff space in this topology,
so we confine our attention to proving compactness. With each vector
x in N we associate a compact space $C_x$, where $C_x$ is the closed interval
$[-\|x\|, \|x\|]$ or the closed disc $\{z : |z| \le \|x\|\}$, according as N is real
or complex. By Tychonoff's theorem, the product C of all the $C_x$'s is
also a compact space. For each x, the values f(x) of all f's in S* lie in
$C_x$. This enables us to imbed S* in C by regarding each f in S* as
identical with the array of all its values at the vectors x in N. It is clear
from the definitions of the topologies concerned that the weak* topology
on S* equals its topology as a subspace of C; and since C is compact, it
suffices to show that S* is closed as a subspace of C. We show that if g
is in the closure $\overline{S^*}$, then g is in S*. If we consider g to be a function defined on the
index set N, then since g is in C we have $|g(x)| \le \|x\|$ for every x in N.
It therefore suffices to show that g is linear as a function defined on N.
Let $\epsilon > 0$ be given, and let x and y be any two vectors in N. Every
basic neighborhood of g intersects S*, so there exists an f in S* such that
$|g(x) - f(x)| < \epsilon/3$, $|g(y) - f(y)| < \epsilon/3$, and $|g(x + y) - f(x + y)| < \epsilon/3$.
Since f is linear, $f(x + y) - f(x) - f(y) = 0$, and we therefore have


$$|g(x + y) - g(x) - g(y)| = |[g(x + y) - f(x + y)] - [g(x) - f(x)] - [g(y) - f(y)]|$$
$$\le |g(x + y) - f(x + y)| + |g(x) - f(x)| + |g(y) - f(y)|$$
$$< \epsilon/3 + \epsilon/3 + \epsilon/3 = \epsilon.$$


The fact that this inequality is true for every « > 0 now implies that 
g(x + y) = g(x) + g(y). We can show in the same way that 


$$g(\alpha x) = \alpha g(x)$$


for every scalar a, so g is linear and the theorem is proved. 


We are now in a position to keep the promise made in the last 
paragraph of Sec. 46, for the following result is an obvious consequence 
of our preceding work.


Theorem B. Let N be a normed linear space, and let S* be the compact 
Hausdorff space obtained by imposing the weak* topology on the closed unit 
sphere in N*. Then the mapping $x \to F_x$, where $F_x(f) = f(x)$ for each f in
S*, is an isometric isomorphism of N into $\mathcal{C}(S^*)$. If N is a Banach space,
this mapping is an isometric isomorphism of N onto a closed linear subspace
of $\mathcal{C}(S^*)$.


This theorem shows, in effect, that the most general Banach space is
essentially a closed linear subspace of $\mathcal{C}(X)$, where X is a compact
Hausdorff space. The purpose of representation theorems in abstract
mathematics is to reveal the structures of complex systems in terms of
simpler ones, and from this point of view, Theorem B is satisfying to a
degree. It must be pointed out, however, that we know next to nothing
about the closed linear subspaces of $\mathcal{C}(X)$, though we know a good deal
about $\mathcal{C}(X)$ itself. Theorem B is therefore somewhat less revealing than
appears at first glance. We shall see in Chaps. 13 and 14 that the
corresponding representation theorem for Banach algebras is much more
significant and useful.


Problems 


1. Let X be a compact Hausdorff space, and justify the assertion that
$\mathcal{C}(X)$ is reflexive if X is finite.

2. If N is a finite-dimensional normed linear space of dimension n, show
that N* also has dimension n. Use this to prove that N is reflexive.

3. If B is a Banach space, prove that B is reflexive $\Leftrightarrow$ B* is reflexive.

4. Prove that if B is a reflexive Banach space, then its closed unit
sphere S is weakly compact.

5. Show that a linear subspace of a normed linear space is closed $\Leftrightarrow$ it is
weakly closed.


50. THE OPEN MAPPING THEOREM 


In this section we have our first encounter with basic theorems which 
require that the spaces concerned be complete. The following rather 
technical lemma is the key to these theorems. 


Lemma. If B and B’ are Banach spaces, and if T is a continuous linear
transformation of B onto B’, then the image of each open sphere centered on
the origin in B contains an open sphere centered on the origin in B’.
PROOF. We denote by $S_r$ and $S'_r$ the open spheres with radius r centered
on the origin in B and B’. It is easy to see that


$$T(S_r) = T(rS_1) = rT(S_1),$$


so it suffices to show that $T(S_1)$ contains some $S'_r$.

We begin by proving that the closure $\overline{T(S_1)}$ contains some $S'_r$. Since T is onto,
we see that $B' = \bigcup_{n=1}^\infty T(S_n)$. B’ is complete, so Baire's theorem
implies that some $\overline{T(S_{n_0})}$ has an interior point $y_0$, which may be assumed
to lie in $T(S_{n_0})$. The mapping $y \to y - y_0$ is a homeomorphism of B’
onto itself, so $\overline{T(S_{n_0})} - y_0$ has the origin as an interior point. Since $y_0$
is in $T(S_{n_0})$, we have $T(S_{n_0}) - y_0 \subseteq T(S_{2n_0})$; and from this we obtain
$\overline{T(S_{n_0})} - y_0 = \overline{T(S_{n_0}) - y_0} \subseteq \overline{T(S_{2n_0})}$, which shows that the origin is an
interior point of $\overline{T(S_{2n_0})}$. Multiplication by any non-zero scalar is a
homeomorphism of B’ onto itself, so $\overline{T(S_{2n_0})} = \overline{2n_0 T(S_1)} = 2n_0 \overline{T(S_1)}$; and
it follows from this that the origin is also an interior point of $\overline{T(S_1)}$, so
$S'_\epsilon \subseteq \overline{T(S_1)}$ for some positive number $\epsilon$.

We conclude the proof by showing that $S'_\epsilon \subseteq T(S_3)$, which is clearly
equivalent to $S'_{\epsilon/3} \subseteq T(S_1)$. Let y be a vector in B’ such that $\|y\| < \epsilon$.
Since y is in $\overline{T(S_1)}$, there exists a vector $x_1$ in B such that $\|x_1\| < 1$ and
$\|y - y_1\| < \epsilon/2$, where $y_1 = T(x_1)$. We next observe that
$S'_{\epsilon/2} \subseteq \overline{T(S_{1/2})}$, so there exists a vector $x_2$ in B such that $\|x_2\| < 1/2$ and
$\|(y - y_1) - y_2\| < \epsilon/4$, where $y_2 = T(x_2)$. Continuing in this way, we
obtain a sequence $\{x_n\}$ in B such that $\|x_n\| < 1/2^{n-1}$ and
$\|y - (y_1 + y_2 + \cdots + y_n)\| < \epsilon/2^n$, where $y_n = T(x_n)$. If we put


$$s_n = x_1 + x_2 + \cdots + x_n,$$


then it follows from $\|x_n\| < 1/2^{n-1}$ that $\{s_n\}$ is a Cauchy sequence in B for
which


$$\|s_n\| \le \|x_1\| + \|x_2\| + \cdots + \|x_n\| < 1 + 1/2 + \cdots + 1/2^{n-1} < 2.$$


B is complete, so there exists a vector x in B such that $s_n \to x$; and
$\|x\| = \|\lim s_n\| = \lim \|s_n\| \le 2 < 3$ shows that x is in $S_3$. All that
remains is to notice that the continuity of T yields


$$T(x) = T(\lim s_n) = \lim T(s_n) = \lim (y_1 + y_2 + \cdots + y_n) = y,$$
from which we see that y is in $T(S_3)$.




This makes our main theorem easy to prove. 


Theorem A (the Open Mapping Theorem). If B and B’ are Banach
spaces, and if T is a continuous linear transformation of B onto B’, then
T is an open mapping.

PROOF. We must show that if G is an open set in B, then T(G) is also
an open set in B’. If y is a point in T(G), it suffices to produce an open
sphere centered on y and contained in T(G). Let x be a point in G
such that T(x) = y. Since G is open, x is the center of an open sphere,
which can be written in the form $x + S_r$, contained in G. Our lemma
now implies that $T(S_r)$ contains some $S'_{r_1}$. It is clear that $y + S'_{r_1}$ is an
open sphere centered on y, and the fact that it is contained in T(G) follows
at once from $y + S'_{r_1} \subseteq y + T(S_r) = T(x) + T(S_r) = T(x + S_r) \subseteq T(G)$.




Most of the applications of the open mapping theorem depend more 
directly on the following special case, which we state separately for the 
sake of emphasis. 


Theorem B. A one-to-one continuous linear transformation of one Banach 
space onto another is a homeomorphism. In particular, if a one-to-one 
linear transformation T of a Banach space onto itself is continuous, then its 
inverse $T^{-1}$ is automatically continuous.


As our first application of Theorem B, we give a geometric characterization
of the projections on a Banach space. The reader will recall
from Sec. 44 that a projection E on a linear space L is simply an idempotent
($E^2 = E$) linear transformation of L into itself. He will also
recall that projections on L can be described geometrically as follows:

(1) a projection E determines a pair of linear subspaces M and N
such that $L = M \oplus N$, where $M = \{E(x) : x \in L\}$ and
$N = \{x : E(x) = 0\}$ are the range and null space of E;
(2) a pair of linear subspaces M and N such that $L = M \oplus N$
determines a projection E whose range and null space are
M and N (if z = x + y is the unique representation of a
vector in L as a sum of vectors in M and N, then E is defined by
E(z) = x).

These facts show that the study of projections on L is equivalent to the
study of pairs of linear subspaces which are disjoint and span L.

In the theory of Banach spaces, however, more is required of a 
projection than mere linearity and idempotence. A projection on a 
Banach space B is an idempotent operator on B; that is, it is a projection 
on B in the algebraic sense which is also continuous. Our present task 
is to assess the effect of the additional requirement of continuity on the 
geometric descriptions given in (1) and (2) above. The analogue of (1) 
is easy. 


Theorem C. If P is a projection on a Banach space B, and if M and N
are its range and null space, then M and N are closed linear subspaces of B
such that $B = M \oplus N$.

PROOF. P is an algebraic projection, so (1) gives everything except
the fact that M and N are closed. The null space of any continuous
linear transformation is closed, so N is obviously closed; and the fact
that M is also closed is a consequence of


$$M = \{P(x) : x \in B\} = \{x : P(x) = x\} = \{x : (I - P)(x) = 0\},$$
which exhibits M as the null space of the operator I - P.


The analogue of (2) is more difficult, for Theorem B is needed in its 
proof. 


Theorem D. Let B be a Banach space, and let M and N be closed linear
subspaces of B such that $B = M \oplus N$. If z = x + y is the unique
representation of a vector in B as a sum of vectors in M and N, then the
mapping P defined by P(z) = x is a projection on B whose range and null
space are M and N.

PROOF. Everything stated is clear from (2) except the fact that P is
continuous, and this we prove as follows. By Problem 46-2, if B’
denotes the linear space B equipped with the norm defined by


$$\|z\|' = \|x\| + \|y\|,$$
then B’ is a Banach space; and since $\|P(z)\| = \|x\| \le \|x\| + \|y\| = \|z\|'$,
P is clearly continuous as a mapping of B’ into B. It therefore suffices
to prove that B’ and B have the same topology. If T denotes the
identity mapping of B’ onto B, then


$$\|T(z)\| = \|z\| = \|x + y\| \le \|x\| + \|y\| = \|z\|'$$


shows that T is continuous as a one-to-one linear transformation of B’
onto B. Theorem B now implies that T is a homeomorphism, and the
proof is complete.
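
The construction in Theorem D can be made concrete in a finite-dimensional toy case, where continuity is automatic and only the algebra and the two norms are visible. The following Python sketch takes B to be the real plane with the Euclidean norm, M the line spanned by (1, 0), and N the line spanned by (1, 1); these choices and all function names are assumptions made for illustration.

```python
# Sketch of Theorem D in the toy case B = R^2 (all names are illustrative).
# M is spanned by m = (1, 0) and N by n = (1, 1), so every z in B has a
# unique representation z = x + y with x in M and y in N.
import math

m = (1.0, 0.0)
n = (1.0, 1.0)

def decompose(z):
    """Return (x, y) with z = x + y, x in M, y in N."""
    # Solve z = a*m + b*n: the second coordinate gives b, then a follows.
    b = z[1]
    a = z[0] - b
    x = (a * m[0], a * m[1])
    y = (b * n[0], b * n[1])
    return x, y

def P(z):
    """The projection on M along N, defined by P(z) = x."""
    return decompose(z)[0]

def norm(v):
    """The original (Euclidean) norm on B."""
    return math.hypot(v[0], v[1])

def norm_prime(z):
    """The new norm ||z||' = ||x|| + ||y|| used in the proof."""
    x, y = decompose(z)
    return norm(x) + norm(y)

z = (3.0, 5.0)
x, y = decompose(z)
print(P(z), P(P(z)), P(n))          # idempotence, and N is the null space
print(norm(P(z)), norm(z), norm_prime(z))
```

The second printed line illustrates the two inequalities used in the proof, namely that both $\|P(z)\|$ and $\|z\|$ are bounded by $\|z\|'$.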


This theorem raises some interesting and significant questions. Let 
M be a closed linear subspace of a Banach space B. As we remarked at 
the end of Sec. 44, there is always at least one algebraic projection defined 
on B whose range is M, and there may be a great many. However, it
might well happen that none of these are continuous, and that consequently
none are projections in our present sense. In the light of our
theorems, this is equivalent to saying that there might not exist any
closed linear subspace N such that $B = M \oplus N$. What sorts of Banach
spaces have the property that this awkward situation cannot occur? 
We shall see in the next chapter that a Hilbert space—which is a special 
type of Banach space—has this property. We shall also see that this 
property is closely linked to the satisfying geometric structure which 
sets Hilbert spaces apart from general Banach spaces. 

We now turn to the closed graph theorem. Let B and B’ be Banach
spaces. If we define a metric on the product $B \times B'$ by


$$d((x_1,y_1), (x_2,y_2)) = \max\{\|x_1 - x_2\|, \|y_1 - y_2\|\},$$


then the resulting topology is easily seen to be the same as the product
topology, and convergence with respect to this metric is equivalent to
coordinatewise convergence. Now let T be a linear transformation of
B into B’. We recall that the graph of T is that subset of $B \times B'$
which consists of all ordered pairs of the form (x, T(x)). Problem 26-6
shows that if T is continuous, then its graph is closed as a subset of
$B \times B'$. In the present context, the converse is also true.


Theorem E (the Closed Graph Theorem). If B and B’ are Banach
spaces, and if T is a linear transformation of B into B’, then T is continuous
$\Leftrightarrow$ its graph is closed.

PROOF. In view of the above remarks, we may confine our attention
to proving that T is continuous if its graph is closed. We denote by $B_1$
the linear space B renormed by $\|x\|_1 = \|x\| + \|T(x)\|$. Since


$$\|T(x)\| \le \|x\| + \|T(x)\| = \|x\|_1,$$
T is continuous as a mapping of $B_1$ into B’. It therefore suffices to show
that B and $B_1$ have the same topology. The identity mapping of $B_1$
onto B is clearly continuous, for $\|x\| \le \|x\| + \|T(x)\| = \|x\|_1$. If we can
show that $B_1$ is complete, then Theorem B will guarantee that this
mapping is a homeomorphism, and this will conclude the proof. Let
$\{x_n\}$ be a Cauchy sequence in $B_1$. It follows that $\{x_n\}$ and $\{T(x_n)\}$ are
also Cauchy sequences in B and B’; and since both of these spaces are
complete, there exist vectors x and y in B and B’ such that $\|x_n - x\| \to 0$
and $\|T(x_n) - y\| \to 0$. Our assumption that the graph of T is closed
in $B \times B'$ implies that (x,y) lies on this graph, so T(x) = y. The
completeness of $B_1$ now follows from


$$\|x_n - x\|_1 = \|x_n - x\| + \|T(x_n - x)\| = \|x_n - x\| + \|T(x_n) - T(x)\|$$
$$= \|x_n - x\| + \|T(x_n) - y\| \to 0.$$


The closed graph theorem has a number of interesting applications
to problems in analysis, but since our concern here is mainly with matters
of algebra and topology, we do not pause to illustrate its uses in this
direction.¹

¹ See Taylor [41, pp. 181-185].


Problems


1. Let a Banach space B be made into a Banach space B’ by means of a
new norm, and show that the topologies generated by these norms
are the same if either is stronger than the other.

2. In the text, we used Theorem B to prove the closed graph theorem.
Show that Theorem B is a consequence of the closed graph theorem.

3. Let T be a linear transformation of a Banach space B into a Banach
space B’. If $\{f_i\}$ is a set of functionals in B’* which separates the
vectors in B’, and if $f_i T$ is continuous for each $f_i$, prove that T is
continuous.


51. THE CONJUGATE OF AN OPERATOR


We shall see in this section that each operator T on a normed linear
space N induces a corresponding operator, denoted by T* and called the
conjugate of T, on the conjugate space N*. Our first task is to define T*,
and our second is to investigate the properties of the mapping $T \to T^*$.
We base our discussion on the following theorem.


Theorem A (the Uniform Boundedness Theorem). Let B be a Banach
space and N a normed linear space. If $\{T_i\}$ is a non-empty set of continuous
linear transformations of B into N with the property that $\{T_i(x)\}$
is a bounded subset of N for each vector x in B, then $\{\|T_i\|\}$ is a bounded set
of numbers; that is, $\{T_i\}$ is bounded as a subset of $\mathcal{B}(B,N)$.


PROOF. For each positive integer n, the set
$$F_n = \{x : x \in B \text{ and } \|T_i(x)\| \le n \text{ for all } i\}$$
is clearly a closed subset of B, and by our assumption we have
$$B = \bigcup_{n=1}^\infty F_n.$$


Since B is complete, Baire's theorem shows that one of the $F_n$'s, say
$F_{n_0}$, has non-empty interior, and thus contains a closed sphere $S_0$ with
center $x_0$ and radius $r_0 > 0$. This says, in effect, that each vector in
every set $T_i(S_0)$ has norm less than or equal to $n_0$; and for the sake of
brevity, we express this fact by writing $\|T_i(S_0)\| \le n_0$. It is clear that
$S_0 - x_0$ is the closed sphere with radius $r_0$ centered on the origin, so
$(S_0 - x_0)/r_0$ is the closed unit sphere S. Since $x_0$ is in $S_0$, it is evident
that $\|T_i(S_0 - x_0)\| \le 2n_0$. This yields $\|T_i(S)\| \le 2n_0/r_0$, so
$\|T_i\| \le 2n_0/r_0$ for every i, and the proof is complete.


This theorem is often called the Banach-Steinhaus theorem, and it 
has several significant applications to analysis. See, for example, 
Zygmund [46, vol. 1, pp. 165-168] or Gál [11]. For the purposes we
have in view, our main interest is in the following simple consequence of it. 


Theorem B. A non-empty subset X of a normed linear space N is bounded
$\Leftrightarrow$ f(X) is a bounded set of numbers for each f in N*.

PROOF. Since $|f(x)| \le \|f\|\,\|x\|$, it is obvious that if X is bounded, then
f(X) is also bounded for each f.

In order to prove the other half of the theorem, it is convenient to
exhibit the vectors in X by writing $X = \{x_i\}$. We now use the natural
imbedding to pass from X to the corresponding subset $\{F_{x_i}\}$ of N**.
Our assumption that $f(X) = \{f(x_i)\}$ is bounded for each f is clearly
equivalent to the assumption that $\{F_{x_i}(f)\}$ is bounded for each f, and
since N* is complete, Theorem A shows that $\{F_{x_i}\}$ is a bounded subset
of N**. We know that the natural imbedding preserves norms, so X is
evidently a bounded subset of N.


We now turn to the problem of defining the conjugate of an operator 
on a normed linear space N. 

Let L be the linear space of all scalar-valued linear functions defined
on N. The conjugate space N* is clearly a linear subspace of L. Let
T be a linear transformation of N into itself which is not necessarily
continuous. We use T to define a linear transformation T’ of L into
itself, as follows. If f is in L, then T’(f) is defined by


$$[T'(f)](x) = f(T(x)). \qquad (1)$$


We leave it to the reader to verify that T’(f) actually is linear as a function
defined on N, and also that T’ is linear as a mapping of L into itself.

The following natural question now presents itself. Under what
circumstances does T’ map N* into N*? This question has a simple and
elegant answer: $T'(N^*) \subseteq N^* \Leftrightarrow$ T is continuous. If we keep Theorem
B in mind, the proof of this statement is very easy; for if S is the closed
unit sphere in N, then T is continuous $\Leftrightarrow$ T(S) is bounded $\Leftrightarrow$ f(T(S)) is
bounded for each f in N* $\Leftrightarrow$ [T’(f)](S) is bounded for each f in N* $\Leftrightarrow$ T’(f)
is in N* for each f in N*.

We now assume that the linear transformation T is continuous and is 
therefore an operator on N. The preceding developments allow us to 
consider the restriction of T’ to a mapping of N* into itself. We denote 
this restriction by T*, and we call it the conjugate of T. The action of 
T* is given by 


$$[T^*(f)](x) = f(T(x)), \qquad (2)$$


in which, in contrast to (1), f is understood to be a functional on N,
and not merely a scalar-valued linear function. T* is clearly linear, and
the following computation shows that it is continuous:


$$\|T^*\| = \sup\{\|T^*(f)\| : \|f\| \le 1\}$$
$$= \sup\{|[T^*(f)](x)| : \|f\| \le 1 \text{ and } \|x\| \le 1\}$$
$$= \sup\{|f(T(x))| : \|f\| \le 1 \text{ and } \|x\| \le 1\}$$
$$\le \sup\{\|f\|\,\|T\|\,\|x\| : \|f\| \le 1 \text{ and } \|x\| \le 1\}$$
$$\le \|T\|.$$


Since $\|T\| = \sup\{\|T(x)\| : \|x\| \le 1\}$, we see at once from Theorem 48-B
that equality holds here, that is, that


$$\|T^*\| = \|T\|. \qquad (3)$$


The mapping $T \to T^*$ is thus a norm-preserving mapping of $\mathcal{B}(N)$ into
$\mathcal{B}(N^*)$.

We continue in this vein by observing that the mapping $T \to T^*$
also has the following pleasant algebraic properties:


$$(\alpha T_1 + \beta T_2)^* = \alpha T_1^* + \beta T_2^*, \qquad (4)$$
$$(T_1 T_2)^* = T_2^* T_1^*, \qquad (5)$$
and
$$I^* = I. \qquad (6)$$


The proofs of these facts are easy consequences of the definitions. We
illustrate the principles involved by proving (5). It must be shown that
$(T_1T_2)^*(f) = (T_2^*T_1^*)(f)$ for each f in N*, and this means that
$$[(T_1T_2)^*(f)](x) = [(T_2^*T_1^*)(f)](x)$$


for each f in N* and each x in N. A simple computation now shows that


$$[(T_1T_2)^*(f)](x) = f((T_1T_2)(x)) = f(T_1(T_2(x))) = [T_1^*(f)](T_2(x))$$
$$= [T_2^*(T_1^*(f))](x) = [(T_2^*T_1^*)(f)](x).$$


It may be helpful to the reader to have the following summary of 
the results of this discussion. 


Theorem C. If T is an operator on a normed linear space N, then its
conjugate T* defined by Eq. (2) is an operator on N*, and the mapping
$T \to T^*$ is an isometric isomorphism of $\mathcal{B}(N)$ into $\mathcal{B}(N^*)$ which reverses
products and preserves the identity transformation.
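
In finite dimensions the conjugate of an operator has a familiar concrete form, which may help fix the ideas behind Eqs. (2) and (5). The Python sketch below assumes N is the real plane, represents an operator by a 2 x 2 matrix acting on column vectors, and represents a functional by a row vector; under these assumptions T* sends the row vector f to the row vector fT. All names and sample data are hypothetical.

```python
# Sketch of the conjugate T* for N = R^2 (names are illustrative).
# An operator is a 2x2 matrix acting on column vectors; a functional f is a
# row vector with f(x) = f[0]*x[0] + f[1]*x[1]. Then [T*(f)](x) = f(T(x))
# means that T* sends the row vector f to the row vector f*T.

def apply(T, x):
    """The vector T(x)."""
    return [T[0][0] * x[0] + T[0][1] * x[1],
            T[1][0] * x[0] + T[1][1] * x[1]]

def evaluate(f, x):
    """The number f(x)."""
    return f[0] * x[0] + f[1] * x[1]

def conjugate(T):
    """Return T* as a map on functionals (row vectors)."""
    return lambda f: [f[0] * T[0][0] + f[1] * T[1][0],
                      f[0] * T[0][1] + f[1] * T[1][1]]

def compose(T1, T2):
    """Matrix of the product T1 T2 (apply T2 first, then T1)."""
    return [[sum(T1[i][k] * T2[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

T1 = [[1, 2], [0, 1]]
T2 = [[3, 0], [1, -1]]
f = [2, 5]
x = [4, -3]

# Defining relation (2): [T*(f)](x) = f(T(x)).
print(evaluate(conjugate(T1)(f), x), evaluate(f, apply(T1, x)))

# Product reversal (5): (T1 T2)* = T2* T1*.
print(conjugate(compose(T1, T2))(f), conjugate(T2)(conjugate(T1)(f)))
```

Each printed line should show two equal values. The reversal of order in (5) appears here as the fact that row vectors are multiplied by matrices on the right.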


The general significance of the ideas developed here can be under- 
stood only in the light of the theory of operators on Hilbert spaces. 
Some preliminary comments on these matters are given in the introduc- 
tion to the next chapter. 


Problems 


1. Let B be a Banach space and N a normed linear space. If $\{T_n\}$ is a
sequence in $\mathcal{B}(B,N)$ such that $T(x) = \lim T_n(x)$ exists for each x in
B, prove that T is a continuous linear transformation.

2. Let T be an operator on a normed linear space N. If N is considered
to be part of N** by means of the natural imbedding, show that T**
is an extension of T. Observe that if N is reflexive, then T** = T.

3. Let T be an operator on a Banach space B. Show that T has an
inverse $T^{-1}$ $\Leftrightarrow$ T* has an inverse $(T^*)^{-1}$, and that in this case
$(T^*)^{-1} = (T^{-1})^*$.


CHAPTER TEN 


Hilbert Spaces 


One of the principal applications of the theory of Banach algebras 
developed in Chaps. 12 to 14 is to the study of operators on Hilbert 
spaces. Our purpose in this chapter is to present enough of the ele- 
mentary theory of Hilbert spaces and their operators to provide an 
adequate foundation for the deeper theory discussed in these later 
chapters. 

We shall see from the formal definition given in the next section that 
a Hilbert space is a special type of Banach space, one which possesses 
additional structure enabling us to tell when two vectors are orthogonal 
(or perpendicular). The first part of the chapter is concerned solely with 
the geometric implications of this additional structure. 

As we have said before, the objects of greatest interest in connection 
with any linear space are the linear transformations on that space. In 
our treatment of Banach spaces, we took advantage of the metric struc- 
ture of such a space by focusing our attention on its operators. A 
Banach space, however, is still a bit too general to yield a really rich 
theory of operators. One fact that did emerge, which is of great significance
for our present work, is that with each operator T on a Banach
space B there is associated an operator T* (its conjugate) on the conjugate
space B*. We shall see below that one of the central properties of
a Hilbert space H is that there is a natural correspondence between H
and its conjugate space H*. If T is an operator on H, this correspondence
makes it possible to regard the conjugate T* as acting on H itself
(instead of H*), where it can be compared with T. These ideas lead to
the concept of the adjoint of an operator on a Hilbert space, and they
make it easy to understand the importance of operators (such as self-adjoint
and normal operators) which are related in simple ways to their
adjoints.


52. THE DEFINITION AND SOME SIMPLE PROPERTIES 


The Banach spaces studied in the previous chapter are little more 
than linear spaces provided with a reasonable notion of the length of a 
vector. The main geometric concept missing in an abstract space of this 
type is that of the angle between two vectors. The theory of Hilbert 
spaces does not hinge on angles in general, but rather on some means of 
telling when two vectors are orthogonal. 

In order to see how to introduce this concept, we begin by considering
the three-dimensional Euclidean space $R^3$. A vector in $R^3$ is of course
an ordered triple $x = (x_1, x_2, x_3)$ of real numbers, and its norm is defined
by

$$\|x\| = (|x_1|^2 + |x_2|^2 + |x_3|^2)^{1/2}.$$


In elementary vector algebra, the inner product¹ of x and another vector
$y = (y_1, y_2, y_3)$ is defined by
$$(x,y) = x_1y_1 + x_2y_2 + x_3y_3,$$
and this inner product is related to the norm by
$$(x,x) = \|x\|^2.$$
We assume that the reader is familiar with the equation
$$(x,y) = \|x\|\,\|y\| \cos \theta,$$


where $\theta$ is the angle between x and y, and also with the fact that x and y
are orthogonal precisely when (x,y) = 0.

¹ The term dot product and the notation $x \cdot y$ are used in most introductory
treatments of vectors.

Most of these ideas can readily be adapted to the three-dimensional
unitary space $C^3$. For any two vectors $x = (x_1, x_2, x_3)$ and $y = (y_1, y_2, y_3)$
in this space, we define their inner product by


$$(x,y) = x_1\overline{y_1} + x_2\overline{y_2} + x_3\overline{y_3}. \qquad (1)$$
Complex conjugates are introduced here to guarantee that the relation
$$(x,x) = \|x\|^2$$


remains true. It is clear that the inner product defined by (1) is linear
as a function of x for each fixed y, and is also conjugate-symmetric, in the
sense that $(x,y) = \overline{(y,x)}$. In this case, it is no longer possible to think
of (x,y) as representing the product of the norms of x and y and the
cosine of the angle between them, for (x,y) is in general a complex number.
Nevertheless, if the condition (x,y) = 0 is taken as the definition
of orthogonality, then this concept is just as useful here as it is in the real
case.

With these ideas as a background, we are now in a position to give 
our basic definition. A Hilbert space is a complex Banach space whose 
norm arises from an inner product, that is, in which there is defined a 
complex function (x,y) of vectors x and y with the following properties:

(1) $(\alpha x + \beta y, z) = \alpha(x,z) + \beta(y,z)$;

(2) $\overline{(x,y)} = (y,x)$;

(3) $(x,x) = \|x\|^2$.


It is evident that the further relation
$$(x, \alpha y + \beta z) = \overline{\alpha}(x,y) + \overline{\beta}(x,z)$$


is a direct consequence of properties (1) and (2).

The reader may wonder why we restrict our attention to complex 
spaces. Why not consider real spaces as well? As a matter of fact, 
we could easily do so, and many writers adopt this approach. There 
are a few places in this chapter where complex scalars are necessary, but 
the theorems involved are not crucial, and we could get along with real 
scalars without too much difficulty. It is only in the complex case, 
however, that the theory of operators on a Hilbert space assumes a really 
satisfactory form. This will appear with particular clarity in the next 
chapter, where we make essential use of the fact that every polynomial 
equation of the nth degree with complex coefficients has exactly n 
complex roots (some of which, of course, may be repeated). For this 
and other reasons, we limit ourselves to the complex case throughout the 
rest of this book. 

The following are the main examples of Hilbert spaces. In accordance
with the above remarks, the scalars in each example are understood
to be the complex numbers.


Example 1. The space $l_2^n$, with the inner product of two vectors


$$x = (x_1, x_2, \ldots, x_n) \quad\text{and}\quad y = (y_1, y_2, \ldots, y_n)$$
defined by


$$(x,y) = \sum_{i=1}^n x_i \overline{y_i}.$$


It is obvious that conditions (1) to (3) are satisfied.

Example 2. The space $l_2$, with the inner product of the vectors
$$x = \{x_1, x_2, \ldots, x_n, \ldots\} \quad\text{and}\quad y = \{y_1, y_2, \ldots, y_n, \ldots\}$$
defined by
$$(x,y) = \sum_{n=1}^\infty x_n \overline{y_n}.$$


The fact that this series converges (and thus defines a complex number)
for each x and y in $l_2$ is an easy consequence of Cauchy's inequality.


Example 3. The space $L_2$ associated with a measure space $X$ with 
measure $m$, with the inner product of two functions $f$ and $g$ defined by 

$$(f,g) = \int f(x)\overline{g(x)}\,dm(x).$$


This Hilbert space is of course not part of the official content of this book, 
but we mention it anyway in case the reader has some knowledge of these 
matters. 


As our first theorem, we prove a fundamental relation known as the 
Schwarz inequality. 


Theorem A. If $x$ and $y$ are any two vectors in a Hilbert space, then 
$|(x,y)| \le \|x\|\,\|y\|$. 

PROOF. When $y = 0$, the result is clear, for both sides vanish. When 
$y \ne 0$, the inequality is equivalent to $|(x, y/\|y\|)| \le \|x\|$. We may 
therefore confine our attention to proving that if $\|y\| = 1$, then we have 
$|(x,y)| \le \|x\|$ for all $x$. This is a direct consequence of the fact that 

$$0 \le \|x - (x,y)y\|^2 = (x - (x,y)y,\; x - (x,y)y)$$
$$= (x,x) - (x,y)\overline{(x,y)} - \overline{(x,y)}(x,y) + (x,y)\overline{(x,y)}(y,y)$$
$$= (x,x) - (x,y)\overline{(x,y)} = \|x\|^2 - |(x,y)|^2.$$

[Fig. 36. Schwarz’s inequality.] 

An inspection of Fig. 36 will reveal the geometric motivation for this 
computation. 
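
The computation in the proof can be tried out on concrete vectors. The sketch below assumes the finite-dimensional inner product of Example 1 on lists of complex numbers; the names `inner`, `norm`, and `random_vector`, the dimension 4, and the fixed random seed are our own assumptions, chosen only for illustration.

```python
import random

def inner(x, y):
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm(x):
    return inner(x, x).real ** 0.5

random.seed(0)  # fixed seed so that the run is repeatable

def random_vector(n):
    return [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(n)]

# Schwarz's inequality on many random pairs (small tolerance for rounding).
schwarz_holds = True
for _ in range(1000):
    x, y = random_vector(4), random_vector(4)
    if abs(inner(x, y)) > norm(x) * norm(y) + 1e-12:
        schwarz_holds = False

# The identity behind the proof, for a unit vector u:
# ||x - (x,u)u||^2 = ||x||^2 - |(x,u)|^2.
x, y = random_vector(4), random_vector(4)
u = [b / norm(y) for b in y]
c = inner(x, u)
residual = [a - c * b for a, b in zip(x, u)]
gap = abs(norm(residual) ** 2 - (norm(x) ** 2 - abs(c) ** 2))

print(schwarz_holds, gap < 1e-12)
```

The vector `residual` is what remains of $x$ after its component along the unit vector is removed, which is the geometric content of the computation.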


It follows easily from Schwarz’s inequality that the inner product 
in a Hilbert space is jointly continuous: 

$$x_n \to x \text{ and } y_n \to y \;\Rightarrow\; (x_n,y_n) \to (x,y).$$

To prove this, it suffices to observe that 

$$|(x_n,y_n) - (x,y)| = |(x_n,y_n) - (x_n,y) + (x_n,y) - (x,y)|$$
$$\le |(x_n,y_n) - (x_n,y)| + |(x_n,y) - (x,y)| = |(x_n, y_n - y)| + |(x_n - x, y)|$$
$$\le \|x_n\|\,\|y_n - y\| + \|x_n - x\|\,\|y\|,$$

and that the right side tends to 0 because the convergent sequence $\{x_n\}$ 
is bounded in norm. 

A well-known theorem of elementary geometry states that the sum 
of the squares of the sides of a parallelogram equals the sum of the squares 
of its diagonals. This fact has an analogue in the present context, for 
in any Hilbert space the so-called parallelogram law holds: 

$$\|x + y\|^2 + \|x - y\|^2 = 2\|x\|^2 + 2\|y\|^2.$$

This is readily proved by writing out the expression on the left in terms 
of inner products: 

$$\|x + y\|^2 + \|x - y\|^2 = (x + y, x + y) + (x - y, x - y)$$
$$= (x,x) + (x,y) + (y,x) + (y,y) + (x,x) - (x,y) - (y,x) + (y,y)$$
$$= 2(x,x) + 2(y,y) = 2\|x\|^2 + 2\|y\|^2.$$


The parallelogram law has the following important consequence for 
our work in the next section. 


Theorem B. A closed convex subset $C$ of a Hilbert space $H$ contains a 
unique vector of smallest norm. 

PROOF. We recall from the definition in Problem 32-5 that since $C$ is 
convex, it is non-empty and contains $(x + y)/2$ whenever it contains 
$x$ and $y$. Let $d = \inf\{\|x\| : x \in C\}$. There clearly exists a sequence 
$\{x_n\}$ of vectors in $C$ such that $\|x_n\| \to d$. By the convexity of $C$, 
$(x_m + x_n)/2$ is in $C$ and $\|(x_m + x_n)/2\| \ge d$, so $\|x_m + x_n\| \ge 2d$. Using 
the parallelogram law, we obtain 

$$\|x_m - x_n\|^2 = 2\|x_m\|^2 + 2\|x_n\|^2 - \|x_m + x_n\|^2$$
$$\le 2\|x_m\|^2 + 2\|x_n\|^2 - 4d^2;$$

and since $2\|x_m\|^2 + 2\|x_n\|^2 - 4d^2 \to 2d^2 + 2d^2 - 4d^2 = 0$, it follows that 
$\{x_n\}$ is a Cauchy sequence in $C$. Since $H$ is complete and $C$ is closed, 
$C$ is complete, and there exists a vector $x$ in $C$ such that $x_n \to x$. It is 
clear by the fact that $\|x\| = \|\lim x_n\| = \lim \|x_n\| = d$ that $x$ is a vector 
in $C$ with smallest norm. To see that $x$ is unique, suppose that $x'$ is a 
vector in $C$ other than $x$ which also has norm $d$. Then $(x + x')/2$ is 
also in $C$, and another application of the parallelogram law yields 

$$\left\|\frac{x + x'}{2}\right\|^2 = \frac{\|x\|^2}{2} + \frac{\|x'\|^2}{2} - \left\|\frac{x - x'}{2}\right\|^2 < \frac{\|x\|^2}{2} + \frac{\|x'\|^2}{2} = d^2,$$

which contradicts the definition of $d$. 


The parallelogram law has another interesting application, which 
depends on the fact that in any Hilbert space the inner product is related 
to the norm by the following identity: 

$$4(x,y) = \|x + y\|^2 - \|x - y\|^2 + i\|x + iy\|^2 - i\|x - iy\|^2. \qquad (2)$$


This is easily verified by converting the expression on the right into inner 
products. 
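
Both the parallelogram law and the identity (2) are easy to test on concrete vectors. The following sketch assumes the inner product of Example 1 on lists of complex numbers; the helpers `inner`, `norm_sq`, and `add` and the two sample vectors are our own illustrative choices.

```python
def inner(x, y):
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm_sq(x):
    return inner(x, x).real

def add(x, y, scale=1):
    # Returns x + scale * y, componentwise.
    return [a + scale * b for a, b in zip(x, y)]

x = [1 + 2j, -1j, 3 + 0j]
y = [2 - 1j, 1 + 1j, -2j]

# Parallelogram law: ||x+y||^2 + ||x-y||^2 = 2||x||^2 + 2||y||^2.
par_lhs = norm_sq(add(x, y)) + norm_sq(add(x, y, -1))
par_rhs = 2 * norm_sq(x) + 2 * norm_sq(y)

# Identity (2): 4(x,y) = ||x+y||^2 - ||x-y||^2 + i||x+iy||^2 - i||x-iy||^2.
polar = (norm_sq(add(x, y)) - norm_sq(add(x, y, -1))
         + 1j * norm_sq(add(x, y, 1j)) - 1j * norm_sq(add(x, y, -1j)))

print(abs(par_lhs - par_rhs) < 1e-12, abs(polar - 4 * inner(x, y)) < 1e-12)
```

The point of (2) is visible in the code: the right side is computed from norms alone, yet it recovers the inner product.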


Theorem C. If $B$ is a complex Banach space whose norm obeys the 
parallelogram law, and if an inner product is defined on $B$ by (2), then $B$ is a 
Hilbert space. 

PROOF. All that is necessary is to make sure that the inner product 
defined by (2) has the three properties required by the definition of a 
Hilbert space. This is easy in the case of properties (2) and (3). 
Property (1) is best treated by splitting it into two parts: 

$$(x + y, z) = (x,z) + (y,z),$$

and $(\alpha x, y) = \alpha(x,y)$. The first requires the parallelogram law, and the 
second follows from the first. We ask the reader (in Problem 6) to work 
out the details. 


This result has no implications at all for our future work. However, 
it does provide a satisfying geometric insight into the place Hilbert 
spaces occupy among all complex Banach spaces: they are precisely those 
in which the parallelogram law is true. 


Problems 


1. Show that the series which defines the inner product in Example 2 
   is convergent. 
2. The Hilbert cube is the subset of $l_2$ consisting of all sequences 

   $$x = \{x_1, x_2, \ldots, x_n, \ldots\}$$

   such that $|x_n| \le 1/n$ for all $n$. Show that this set is compact as a 
   subspace of $l_2$. 
3. For the special Hilbert space $l_2^n$, use Cauchy’s inequality to prove 
   Schwarz’s inequality. 
4. Show that the parallelogram law is not true in $l_1^n$ ($n > 1$). 
5. In a Hilbert space, show that if $\|x\| = \|y\| = 1$, and if $\epsilon > 0$ is given, 
   then there exists $\delta > 0$ such that $\|(x + y)/2\| > 1 - \delta \Rightarrow \|x - y\| < \epsilon$. 
   A Banach space with this property is said to be uniformly 
   convex. See Taylor [41, p. 231]. 
6. Give a detailed proof of Theorem C. 


53. ORTHOGONAL COMPLEMENTS 


Two vectors $x$ and $y$ in a Hilbert space $H$ are said to be orthogonal 
(written $x \perp y$) if $(x,y) = 0$. The symbol $\perp$ is often pronounced “perp.” 
Since $\overline{(x,y)} = (y,x)$, we have $x \perp y \Leftrightarrow y \perp x$. It is also clear that $x \perp 0$ 
for every $x$, and $(x,x) = \|x\|^2$ shows that 0 is the only vector orthogonal 
to itself. One of the simplest geometric facts about orthogonal vectors 
is the Pythagorean theorem: 

$$x \perp y \;\Rightarrow\; \|x + y\|^2 = \|x - y\|^2 = \|x\|^2 + \|y\|^2.$$


A vector $x$ is said to be orthogonal to a non-empty set $S$ (written $x \perp S$) 
if $x \perp y$ for every $y$ in $S$, and the orthogonal complement of $S$, denoted 
by $S^\perp$, is the set of all vectors orthogonal to $S$. The following 
statements are easy consequences of the definition: 

$\{0\}^\perp = H$; $H^\perp = \{0\}$; 
$S \cap S^\perp \subseteq \{0\}$; 
$S_1 \subseteq S_2 \Rightarrow S_1^\perp \supseteq S_2^\perp$; 
$S^\perp$ is a closed linear subspace of $H$. 

It is customary to write $(S^\perp)^\perp$ in the form $S^{\perp\perp}$. Clearly, $S \subseteq S^{\perp\perp}$. 

Let $M$ be a closed linear subspace of $H$. We know that $M^\perp$ is also 
a closed linear subspace, and that $M$ and $M^\perp$ are disjoint in the sense 
that they have only the zero vector in common. Our aim in this section 
is to prove that $H = M \oplus M^\perp$, and each of our theorems is a step in 
this direction. 


Theorem A. Let $M$ be a closed linear subspace of a Hilbert space $H$, let 
$x$ be a vector not in $M$, and let $d$ be the distance from $x$ to $M$. Then there 
exists a unique vector $y_0$ in $M$ such that $\|x - y_0\| = d$. 

PROOF. The set $C = x + M$ is a closed convex set, and $d$ is the distance 
from the origin to $C$ (see Fig. 37). By Theorem 52-B, there exists a 
unique vector $z_0$ in $C$ such that $\|z_0\| = d$. The vector $y_0 = x - z_0$ is 
easily seen to be in $M$, and $\|x - y_0\| = \|z_0\| = d$. The uniqueness of $y_0$ 
follows from the fact that if $y_1$ is a vector in $M$ such that $y_1 \ne y_0$ and 
$\|x - y_1\| = d$, then $z_1 = x - y_1$ is a vector in $C$ such that $z_1 \ne z_0$ and 
$\|z_1\| = d$, which contradicts the uniqueness of $z_0$. 


We use this result to prove 


Theorem B. If $M$ is a proper closed linear subspace of a Hilbert space $H$, 
then there exists a non-zero vector $z_0$ in $H$ such that $z_0 \perp M$. 

PROOF. Let $x$ be a vector not in $M$, and let $d$ be the distance from $x$ to $M$. 
By Theorem A, there exists a vector $y_0$ in $M$ such that $\|x - y_0\| = d$. 
We define $z_0$ by $z_0 = x - y_0$ (see Fig. 37), and we observe that since 
$d > 0$, $z_0$ is a non-zero vector. We conclude the proof by showing that 
if $y$ is an arbitrary vector in $M$, then $z_0 \perp y$. For any scalar $\alpha$, we have 

$$\|z_0 - \alpha y\| = \|x - (y_0 + \alpha y)\| \ge d = \|z_0\|,$$

so $\|z_0 - \alpha y\|^2 - \|z_0\|^2 \ge 0$ 
and 

$$-\overline{\alpha}(z_0,y) - \alpha\overline{(z_0,y)} + |\alpha|^2\|y\|^2 \ge 0. \qquad (1)$$

If we put $\alpha = \beta(z_0,y)$ for an arbitrary real number $\beta$, then (1) becomes 

$$-2\beta|(z_0,y)|^2 + \beta^2|(z_0,y)|^2\|y\|^2 \ge 0.$$

If we now put $a = |(z_0,y)|^2$ and $b = \|y\|^2$, we obtain 

$$-2\beta a + \beta^2 ab \ge 0,$$

so 

$$\beta a(\beta b - 2) \ge 0 \qquad (2)$$

for all real $\beta$. However, if $a > 0$, then (2) is obviously false for all 
sufficiently small positive $\beta$. We see from this that $a = 0$, which means 
that $z_0 \perp y$. 


This proof of Theorem B may strike the reader as being excessively 
dependent on ingenious computations. If so, he will be pleased to learn that 
the ideas developed in the next section can be used to provide another 
proof which is free of computation. 

[Fig. 37] 

In order to state our next theorem, we need the following additional 
concept. Two non-empty subsets $S_1$ and $S_2$ of a Hilbert space are said 
to be orthogonal (written $S_1 \perp S_2$) if $x \perp y$ for all $x$ in $S_1$ and $y$ in $S_2$. 


Theorem C. If $M$ and $N$ are closed linear subspaces of a Hilbert space $H$ 
such that $M \perp N$, then the linear subspace $M + N$ is also closed. 

PROOF. Let $z$ be a limit point of $M + N$. It suffices to show that $z$ is 
in $M + N$. There certainly exists a sequence $\{z_n\}$ in $M + N$ such that 
$z_n \to z$. By the assumption that $M \perp N$, we see that $M$ and $N$ are 
disjoint, so each $z_n$ can be written uniquely in the form $z_n = x_n + y_n$, 
where $x_n$ is in $M$ and $y_n$ is in $N$. The Pythagorean theorem shows that 
$\|z_m - z_n\|^2 = \|x_m - x_n\|^2 + \|y_m - y_n\|^2$, so $\{x_n\}$ and $\{y_n\}$ are Cauchy 
sequences in $M$ and $N$. $M$ and $N$ are closed, and therefore complete, so 
there exist vectors $x$ and $y$ in $M$ and $N$ such that $x_n \to x$ and $y_n \to y$. 
Since $x + y$ is in $M + N$, our conclusion follows from the fact that 
$z = \lim z_n = \lim (x_n + y_n) = \lim x_n + \lim y_n = x + y$. 


The way is now clear for the proof of our principal theorem. 

Theorem D. If $M$ is a closed linear subspace of a Hilbert space $H$, then 
$H = M \oplus M^\perp$. 

PROOF. Since $M$ and $M^\perp$ are orthogonal closed linear subspaces of $H$, 
Theorem C shows that $M + M^\perp$ is also a closed linear subspace of $H$. 
We prove that $M + M^\perp$ equals $H$. If this is not so, then by Theorem B 
there exists a vector $z_0 \ne 0$ such that $z_0 \perp (M + M^\perp)$. This non-zero 
vector must evidently lie in $M^\perp \cap M^{\perp\perp}$; and since this is impossible, we 
infer that $H = M + M^\perp$. To conclude the proof, it suffices to observe 
that since $M$ and $M^\perp$ are disjoint, the statement that $H = M + M^\perp$ 
can be strengthened to $H = M \oplus M^\perp$. 


The main effect of this theorem is to guarantee that a Hilbert space 
is always rich in projections. In fact, if M is an arbitrary closed linear 
subspace of a Hilbert space H, then Theorem 50-D shows that there 
exists a projection defined on H whose range is M and whose null space 
is M+. This satisfactory state of affairs is to be contrasted with the 
situation in a general Banach space, as explained in the remarks following 
Theorem 50-D. 
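
In a finite-dimensional space the decomposition $H = M \oplus M^\perp$ can be carried out explicitly. The sketch below is only an illustration under several assumptions of our own: $H$ is taken to be the space of complex triples, $M$ is spanned by an orthonormal pair that we chose, and the projection on $M$ is computed by summing components along that pair (a device that anticipates the next section). The names `inner`, `basis_M`, and `project` are hypothetical.

```python
def inner(x, y):
    return sum(a * b.conjugate() for a, b in zip(x, y))

# M is the subspace of complex triples spanned by this orthonormal pair.
s = 2 ** -0.5
basis_M = [[s + 0j, s * 1j, 0j], [0j, 0j, 1 + 0j]]

def project(x, basis):
    # Sum of the components (x, e) e over an orthonormal basis of M.
    m = [0j] * len(x)
    for e in basis:
        c = inner(x, e)
        m = [mi + c * ei for mi, ei in zip(m, e)]
    return m

x = [1 + 1j, 2 - 1j, 3j]
m = project(x, basis_M)              # the part of x in M
p = [a - b for a, b in zip(x, m)]    # the part of x in the orthogonal complement

orthogonal = all(abs(inner(p, e)) < 1e-12 for e in basis_M)
idempotent = all(abs(a - b) < 1e-12 for a, b in zip(project(m, basis_M), m))
print(orthogonal, idempotent)
```

Here `project` plays the role of the projection whose range is $M$ and whose null space is $M^\perp$: applying it twice changes nothing, and what it removes from $x$ is orthogonal to $M$.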


Problems 


1. If $S$ is a non-empty subset of a Hilbert space, show that $S^\perp = S^{\perp\perp\perp}$. 

2. If $M$ is a linear subspace of a Hilbert space, show that $M$ is closed 
   $\Leftrightarrow M = M^{\perp\perp}$. 

3. If $S$ is a non-empty subset of a Hilbert space $H$, show that the set 
   of all linear combinations of vectors in $S$ is dense in $H \Leftrightarrow S^\perp = \{0\}$. 

4. If $S$ is a non-empty subset of a Hilbert space $H$, show that $S^{\perp\perp}$ is the 
   closure of the set of all linear combinations of vectors in $S$. This is 
   usually expressed by saying that $S^{\perp\perp}$ is the smallest closed linear 
   subspace of $H$ which contains $S$. 


54. ORTHONORMAL SETS 


An orthonormal set in a Hilbert space $H$ is a non-empty subset of $H$ 
which consists of mutually orthogonal unit vectors; that is, it is a 
non-empty subset $\{e_i\}$ of $H$ with the following properties: 

(1) $i \ne j \Rightarrow e_i \perp e_j$; 

(2) $\|e_i\| = 1$ for every $i$. 

If $H$ contains only the zero vector, then it has no orthonormal sets. If 
$H$ contains a non-zero vector $x$, and if we normalize $x$ by considering 
$e = x/\|x\|$, then the single-element set $\{e\}$ is clearly an orthonormal set. 
More generally, if $\{x_i\}$ is a non-empty set of mutually orthogonal 
non-zero vectors in $H$, and if the $x_i$’s are normalized by replacing each of them 
by $e_i = x_i/\|x_i\|$, then the resulting set $\{e_i\}$ is an orthonormal set. 


Example 1. The subset $\{e_1, e_2, \ldots, e_n\}$ of $l_2^n$, where $e_i$ is the $n$-tuple 
with 1 in the $i$th place and 0’s elsewhere, is evidently an orthonormal 
set in this space. 


Example 2. Similarly, if $e_n$ is the sequence with 1 in the $n$th place and 
0’s elsewhere, then $\{e_1, e_2, \ldots, e_n, \ldots\}$ is an orthonormal set in $l_2$. 


At the end of this section, we give some additional examples taken 
from the field of analysis. 

Every aspect of the theory of orthonormal sets depends in one way 
or another on our first theorem. 


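Theorem A. Let $\{e_1, e_2, \ldots, e_n\}$ be a finite orthonormal set in a Hilbert 
space $H$. If $x$ is any vector in $H$, then 

$$\sum_{i=1}^{n} |(x,e_i)|^2 \le \|x\|^2; \qquad (1)$$

further, 

$$x - \sum_{i=1}^{n} (x,e_i)e_i \perp e_j \qquad (2)$$

for each $j$. 

PROOF. The inequality (1) follows from a computation similar to that 
used in proving Schwarz’s inequality: 

$$0 \le \Bigl\|x - \sum_{i=1}^{n} (x,e_i)e_i\Bigr\|^2 = \Bigl(x - \sum_{i=1}^{n} (x,e_i)e_i,\; x - \sum_{j=1}^{n} (x,e_j)e_j\Bigr)$$
$$= (x,x) - \sum_{i=1}^{n} (x,e_i)\overline{(x,e_i)} - \sum_{j=1}^{n} \overline{(x,e_j)}(x,e_j) + \sum_{i=1}^{n}\sum_{j=1}^{n} (x,e_i)\overline{(x,e_j)}(e_i,e_j)$$
$$= \|x\|^2 - \sum_{i=1}^{n} |(x,e_i)|^2.$$

To conclude the proof, we observe that 

$$\Bigl(x - \sum_{i=1}^{n} (x,e_i)e_i,\; e_j\Bigr) = (x,e_j) - \sum_{i=1}^{n} (x,e_i)(e_i,e_j) = (x,e_j) - (x,e_j) = 0,$$

from which statement (2) follows at once. 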


The reader should note that the inequality (1) can be given the 
following loose but illuminating geometric interpretation: the sum of the 
squares of the components of a vector in various perpendicular directions 
does not exceed the square of the length of the vector itself. This is 
usually called Bessel’s inequality, though, as we shall see below, it is only 
a special case of a more general inequality with the same name. In a 
similar vein, relation (2) says that if we subtract from a vector its 
components in several perpendicular directions, then the result has no 
component left in any of these directions. 

Our next task is to prove that both parts of Theorem A generalize 
to the case of an arbitrary orthonormal set. The main problem here 
is to show that the sums in (1) and (2) can be defined in a reasonable way 
when no restriction is placed on the number of $e_i$’s under consideration. 
The key to this problem lies in the following theorem. 


Theorem B. If $\{e_i\}$ is an orthonormal set in a Hilbert space $H$, and if $x$ 
is any vector in $H$, then the set $S = \{e_i : (x,e_i) \ne 0\}$ is either empty or 
countable. 


PROOF. For each positive integer $n$, consider the set 

$$S_n = \{e_i : |(x,e_i)|^2 > \|x\|^2/n\}.$$

By Bessel’s inequality, $S_n$ contains at most $n - 1$ vectors. The 
conclusion now follows from the fact that $S = \bigcup_{n=1}^{\infty} S_n$. 


As our first application of this result, we prove the general form of 
Bessel’s inequality. 


Theorem C (Bessel’s Inequality). If $\{e_i\}$ is an orthonormal set in a 
Hilbert space $H$, then 

$$\sum |(x,e_i)|^2 \le \|x\|^2 \qquad (3)$$

for every vector $x$ in $H$. 


PROOF. Our basic obligation here is to explain what is meant by the 
sum on the left of (3). Once this is clearly understood, the proof is 
easy. As in the preceding theorem, we write $S = \{e_i : (x,e_i) \ne 0\}$. If $S$ 
is empty, we define $\sum |(x,e_i)|^2$ to be the number 0; and in this case, (3) is 
obviously true. We now assume that $S$ is non-empty, and we see from 
Theorem B that it must be finite or countably infinite. If $S$ is finite, it 
can be written in the form $S = \{e_1, e_2, \ldots, e_n\}$ for some positive 
integer $n$. In this case, we define $\sum |(x,e_i)|^2$ to be $\sum_{i=1}^{n} |(x,e_i)|^2$, which is 
clearly independent of the order in which the elements of $S$ are arranged. 
The inequality (3) now reduces to (1), which has already been proved. 
All that remains is to consider the case in which $S$ is countably infinite. 
Let the vectors in $S$ be arranged in a definite order: 

$$S = \{e_1, e_2, \ldots, e_n, \ldots\}.$$

By the theory of absolutely convergent series, if $\sum_{n=1}^{\infty} |(x,e_n)|^2$ converges, 
then every series obtained from this by rearranging its terms also 
converges, and all such series have the same sum. We therefore define 
$\sum |(x,e_i)|^2$ to be $\sum_{n=1}^{\infty} |(x,e_n)|^2$, and it follows from the above remark that 
$\sum |(x,e_i)|^2$ is a non-negative extended real number which depends only 
on $S$, and not on the arrangement of its vectors. We conclude the proof 
by observing that in this case, (3) reduces to the assertion that 

$$\sum_{n=1}^{\infty} |(x,e_n)|^2 \le \|x\|^2, \qquad (4)$$

and since it follows from (1) that no partial sum of the series on the left 
of (4) can exceed $\|x\|^2$, it is clear that (4) itself is true. 


The second part of Theorem A is generalized in essentially the 
same way. 


Theorem D. If $\{e_i\}$ is an orthonormal set in a Hilbert space $H$, and if $x$ 
is an arbitrary vector in $H$, then 

$$x - \sum (x,e_i)e_i \perp e_j \qquad (5)$$

for each $j$. 

PROOF. As in the above proof, we define $\sum (x,e_i)e_i$ for each of the various 
cases, and we prove (5) as we go along. We again write 

$$S = \{e_i : (x,e_i) \ne 0\}.$$

When $S$ is empty, we define $\sum (x,e_i)e_i$ to be the vector 0, and we observe 
that (5) reduces to the statement that $x - 0 = x$ is orthogonal to each $e_j$, 
which is precisely what is meant by saying that $S$ is empty. When $S$ 
is non-empty and finite, and can be written in the form 

$$S = \{e_1, e_2, \ldots, e_n\},$$

we define $\sum (x,e_i)e_i$ to be $\sum_{i=1}^{n} (x,e_i)e_i$; and in this case, (5) reduces to (2), 
which has already been proved. 

By Theorem B, we may assume for the remainder of the proof that 
$S$ is countably infinite. Let the vectors in $S$ be listed in a definite order: 
$S = \{e_1, e_2, \ldots, e_n, \ldots\}$. We put $s_n = \sum_{i=1}^{n} (x,e_i)e_i$, and we note 
that for $m > n$ we have 

$$\|s_m - s_n\|^2 = \Bigl\|\sum_{i=n+1}^{m} (x,e_i)e_i\Bigr\|^2 = \sum_{i=n+1}^{m} |(x,e_i)|^2.$$

Bessel’s inequality shows that the series $\sum_{n=1}^{\infty} |(x,e_n)|^2$ converges, so 
$\{s_n\}$ is a Cauchy sequence in $H$; and since $H$ is complete, this sequence 
converges to a vector $s$, which we write in the form $s = \sum_{n=1}^{\infty} (x,e_n)e_n$. 
We now define $\sum (x,e_i)e_i$ to be $\sum_{n=1}^{\infty} (x,e_n)e_n$, and, deferring for a moment 
the question of what happens when the vectors in $S$ are rearranged, we 
observe that (5) follows from 

$$\Bigl(x - \sum (x,e_i)e_i,\; e_j\Bigr) = (x - s, e_j) = (x,e_j) - (s,e_j) = (x,e_j) - (\lim s_n, e_j)$$
$$= (x,e_j) - \lim (s_n,e_j) = (x,e_j) - (x,e_j) = 0.$$

All that remains is to show that this definition of $\sum (x,e_i)e_i$ is valid, in the 
sense that it does not depend on the arrangement of the vectors in $S$. 
Let the vectors in $S$ be rearranged in any manner: 

$$S = \{f_1, f_2, \ldots, f_n, \ldots\}.$$

We put $s'_n = \sum_{i=1}^{n} (x,f_i)f_i$, and we see, as above, that the sequence 
$\{s'_n\}$ converges to a limit $s'$, which we write in the form $s' = \sum_{n=1}^{\infty} (x,f_n)f_n$. 
We conclude the proof by showing that $s'$ equals $s$. Let $\epsilon > 0$ be given, 
and let $n_0$ be a positive integer so large that if $n \ge n_0$, then $\|s_n - s\| < \epsilon$, 
$\|s'_n - s'\| < \epsilon$, and $\sum_{i=n_0+1}^{\infty} |(x,e_i)|^2 < \epsilon^2$. For some positive integer 
$m_0 > n_0$, all terms of $s_{n_0}$ occur among those of $s'_{m_0}$, so $s'_{m_0} - s_{n_0}$ is a finite 
sum of terms of the form $(x,e_i)e_i$ for $i = n_0 + 1, n_0 + 2, \ldots$. This 
yields $\|s'_{m_0} - s_{n_0}\|^2 \le \sum_{i=n_0+1}^{\infty} |(x,e_i)|^2 < \epsilon^2$, so $\|s'_{m_0} - s_{n_0}\| < \epsilon$ and 

$$\|s' - s\| \le \|s' - s'_{m_0}\| + \|s'_{m_0} - s_{n_0}\| + \|s_{n_0} - s\| < \epsilon + \epsilon + \epsilon = 3\epsilon.$$

Since $\epsilon$ is arbitrary, this shows that $s' = s$. 


Let H be a non-zero Hilbert space, so that the class of all its ortho- 
normal sets is non-empty. This class is clearly a partially ordered set 
with respect to set inclusion. An orthonormal set $\{e_i\}$ in $H$ is said to be 
complete if it is maximal in this partially ordered set, that is, if it is 
impossible to adjoin a vector $e$ to $\{e_i\}$ in such a way that $\{e_i, e\}$ is an 
orthonormal set which properly contains $\{e_i\}$. 


Theorem E. Every non-zero Hilbert space contains a complete orthonormal 
set. 

PROOF. The statement follows at once from Zorn’s lemma, since the 
union of any chain of orthonormal sets is clearly an upper bound for the 
chain in the partially ordered set of all orthonormal sets. 


Orthonormal sets are truly interesting only when they are complete. 
The reasons for this are presented in our next theorem. 


Theorem F. Let $H$ be a Hilbert space, and let $\{e_i\}$ be an orthonormal set in 
$H$. Then the following conditions are all equivalent to one another: 

(1) $\{e_i\}$ is complete; 

(2) $x \perp \{e_i\} \Rightarrow x = 0$; 

(3) if $x$ is an arbitrary vector in $H$, then $x = \sum (x,e_i)e_i$; 

(4) if $x$ is an arbitrary vector in $H$, then $\|x\|^2 = \sum |(x,e_i)|^2$. 

PROOF. We prove that each of the conditions (1), (2), and (3) implies 
the one following it and that (4) implies (1). 

(1) $\Rightarrow$ (2). If (2) is not true, there exists a vector $x \ne 0$ such that 
$x \perp \{e_i\}$. We now define $e$ by $e = x/\|x\|$, and we observe that $\{e_i, e\}$ 
is an orthonormal set which properly contains $\{e_i\}$. This contradicts the 
completeness of $\{e_i\}$. 

(2) $\Rightarrow$ (3). By Theorem D, $x - \sum (x,e_i)e_i$ is orthogonal to $\{e_i\}$, so 
(2) implies that $x - \sum (x,e_i)e_i = 0$, or equivalently, that $x = \sum (x,e_i)e_i$. 

(3) $\Rightarrow$ (4). By the joint continuity of the inner product, the 
expression in (3) yields 

$$\|x\|^2 = (x,x) = \Bigl(\sum (x,e_i)e_i,\; \sum (x,e_j)e_j\Bigr) = \sum (x,e_i)\overline{(x,e_i)} = \sum |(x,e_i)|^2.$$

(4) $\Rightarrow$ (1). If $\{e_i\}$ is not complete, it is a proper subset of an 
orthonormal set $\{e_i, e\}$. Since $e$ is orthogonal to all the $e_i$’s, (4) yields 
$\|e\|^2 = \sum |(e,e_i)|^2 = 0$, and this contradicts the fact that $e$ is a unit vector. 


There is some standard terminology which is often used in connection 
with this theorem. Let $\{e_i\}$ be a complete orthonormal set in a Hilbert 
space $H$, and let $x$ be an arbitrary vector in $H$. The numbers $(x,e_i)$ are 
called the Fourier coefficients of $x$, the expression $x = \sum (x,e_i)e_i$ is called 
the Fourier expansion of $x$, and the equation $\|x\|^2 = \sum |(x,e_i)|^2$ is called 
Parseval’s equation, all with respect to the particular complete 
orthonormal set $\{e_i\}$ under consideration. These terms come from the 
classical theory of Fourier series, as indicated in our next example. 


Example 3. Consider the Hilbert space $L_2$ associated with the measure 
space $[0,2\pi]$, where measure is Lebesgue measure and integrals are 
Lebesgue integrals.¹ This space essentially consists of all complex 
functions $f$ defined on $[0,2\pi]$ which are Lebesgue measurable and 
square-integrable, in the sense that 

$$\int_0^{2\pi} |f(x)|^2\,dx < \infty.$$

Its norm and inner product are defined by 

$$\|f\| = \Bigl(\int_0^{2\pi} |f(x)|^2\,dx\Bigr)^{1/2}$$

and 

$$(f,g) = \int_0^{2\pi} f(x)\overline{g(x)}\,dx.$$

A simple computation shows that the functions $e^{inx}$, for 
$n = 0, \pm 1, \pm 2, \ldots$, are mutually orthogonal in $L_2$: 

$$\int_0^{2\pi} e^{imx}e^{-inx}\,dx = \begin{cases} 0 & m \ne n \\ 2\pi & m = n. \end{cases}$$

It follows from this that the functions $e_n$ ($n = 0, \pm 1, \pm 2, \ldots$) defined 
by $e_n(x) = e^{inx}/\sqrt{2\pi}$ form an orthonormal set in $L_2$. For any function 
$f$ in $L_2$, the numbers 

$$c_n = (f,e_n) = \frac{1}{\sqrt{2\pi}} \int_0^{2\pi} f(x)e^{-inx}\,dx \qquad (6)$$

are its classical Fourier coefficients, and Bessel’s inequality takes the form 

$$\sum_{n=-\infty}^{\infty} |c_n|^2 \le \int_0^{2\pi} |f(x)|^2\,dx.$$

¹ In order to understand this and the next example, the reader should have some 
knowledge of the modern theory of measure and integration. We wish to emphasize 
once again that these examples are in no way essential to the structure of the book, 
and may be skipped by any reader without the necessary background. We 
advise such a reader to ignore these examples and to proceed at once to the discussion 
of the Gram-Schmidt process. 

It is a fact of very great importance in the theory of Fourier series that 
the orthonormal set $\{e_n\}$ is complete in $L_2$. As we have seen in Theorem 
F, the completeness of $\{e_n\}$ is equivalent to the assertion that for every 
$f$ in $L_2$, Bessel’s inequality can be strengthened to Parseval’s equation: 

$$\sum_{n=-\infty}^{\infty} |c_n|^2 = \int_0^{2\pi} |f(x)|^2\,dx.$$

Theorem F also tells us that the completeness of $\{e_n\}$ is equivalent to the 
statement that each $f$ in $L_2$ has a Fourier expansion: 

$$f(x) = \frac{1}{\sqrt{2\pi}} \sum_{n=-\infty}^{\infty} c_n e^{inx}. \qquad (7)$$

It must be emphasized that this expansion is not to be interpreted as 
saying that the series converges pointwise to the function. The meaning 
of (7) is that the partial sums of the series, that is, the vectors $f_n$ in $L_2$ 
defined by 

$$f_n(x) = \frac{1}{\sqrt{2\pi}} \sum_{k=-n}^{n} c_k e^{ikx}, \qquad (8)$$

converge to the vector $f$ in the sense of $L_2$: 

$$\|f_n - f\| \to 0.$$

This situation is often expressed by saying that $f$ is the limit in the mean 
of the $f_n$’s. We add one final remark to our description of this portion of 
the theory of Fourier series. If $f$ is an arbitrary function in $L_2$ with 
Fourier coefficients $c_n$ defined by (6), then Bessel’s inequality tells us that 
the series $\sum_{n=-\infty}^{\infty} |c_n|^2$ converges. The celebrated Riesz-Fischer theorem 
asserts the converse: if $c_n$ ($n = 0, \pm 1, \pm 2, \ldots$) are given complex 
numbers for which $\sum_{n=-\infty}^{\infty} |c_n|^2$ converges, then there exists a function 
$f$ in $L_2$ whose Fourier coefficients are the $c_n$’s. If we grant the 
completeness of $L_2$ as a metric space, this is very easy to prove. All that is 
necessary is to use the $c_n$’s to define a sequence of $f_n$’s in accordance with 
(8). The functions $e^{inx}/\sqrt{2\pi}$ form an orthonormal set, so for $m > n$ we 
have 

$$\|f_m - f_n\|^2 = \sum_{|k|=n+1}^{m} |c_k|^2. \qquad (9)$$

By the convergence of $\sum_{n=-\infty}^{\infty} |c_n|^2$, the sum on the right of (9) can be made 
as small as we please for all sufficiently large $n$ and all $m > n$. This 
tells us that the $f_n$’s form a Cauchy sequence in $L_2$; and since $L_2$ is 
complete, there exists a function $f$ in $L_2$ such that $f_n \to f$. This function $f$ is 
given by (7), and the $c_n$’s are clearly its Fourier coefficients. It is 
apparent from these remarks that the essence of the Riesz-Fischer 
theorem lies in the completeness of $L_2$ as a metric space. 
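
A reader without the background in measure theory can still experiment with a finite-dimensional analogue of this example. The sketch below is not the space $L_2$ of the text: as an assumption of our own, it replaces functions on $[0,2\pi]$ by vectors with $N = 8$ complex components and the functions $e^{inx}/\sqrt{2\pi}$ by the vectors with components $e^{2\pi i t k/N}/\sqrt{N}$, which form a complete orthonormal set in that space. The names and the sample vector `f` are illustrative only.

```python
import cmath

N = 8

def inner(x, y):
    return sum(a * b.conjugate() for a, b in zip(x, y))

# basis[k][t] = exp(2*pi*i*t*k/N) / sqrt(N): a discrete stand-in for
# the functions e^{inx} / sqrt(2*pi) of the text.
basis = [[cmath.exp(2j * cmath.pi * t * k / N) / N ** 0.5 for t in range(N)]
         for k in range(N)]

f = [complex(t * t - 3, (-1) ** t) for t in range(N)]
c = [inner(f, e) for e in basis]        # Fourier coefficients of f

# Parseval's equation: the sum of |c_k|^2 equals ||f||^2.
parseval_gap = abs(sum(abs(ck) ** 2 for ck in c) - inner(f, f).real)

# Fourier expansion: f is recovered as the sum of c_k e_k.
rebuilt = [sum(c[k] * basis[k][t] for k in range(N)) for t in range(N)]
expansion_gap = max(abs(a - b) for a, b in zip(f, rebuilt))

print(parseval_gap < 1e-9, expansion_gap < 1e-9)
```

In this finite setting the expansion holds component by component; the distinction drawn in the text between convergence in the mean and pointwise convergence arises only in the infinite-dimensional case.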


We shall have use for one further item in the general theory of 
orthonormal sets, namely, the Gram-Schmidt orthogonalization process. 
Suppose that $\{x_1, x_2, \ldots, x_n, \ldots\}$ is a linearly independent set in a 
Hilbert space $H$. The problem is to exhibit a constructive procedure for 
converting this set into a corresponding orthonormal set 
$\{e_1, e_2, \ldots, e_n, \ldots\}$ with the property that for each $n$ the linear subspace of $H$ 
spanned by $\{e_1, e_2, \ldots, e_n\}$ is the same as that spanned by 
$\{x_1, x_2, \ldots, x_n\}$. Our first step is to normalize $x_1$, which is necessarily 
non-zero, by putting 

$$e_1 = \frac{x_1}{\|x_1\|}.$$

The next step is to subtract from $x_2$ its component in the direction of $e_1$ to 
obtain the vector $x_2 - (x_2,e_1)e_1$ orthogonal to $e_1$, and then to normalize 
this by putting 

$$e_2 = \frac{x_2 - (x_2,e_1)e_1}{\|x_2 - (x_2,e_1)e_1\|}.$$

We observe that since $x_2$ is not a scalar multiple of $x_1$, the vector 
$x_2 - (x_2,e_1)e_1$ is not zero, so the definition of $e_2$ is valid. Also, it is clear that 
$e_2$ is a linear combination of $x_1$ and $x_2$, and that $x_2$ is a linear combination 
of $e_1$ and $e_2$. The next step is to subtract from $x_3$ its components in the 
directions of $e_1$ and $e_2$ to obtain a vector orthogonal to $e_1$ and $e_2$, and then 
to normalize this by putting 

$$e_3 = \frac{x_3 - (x_3,e_1)e_1 - (x_3,e_2)e_2}{\|x_3 - (x_3,e_1)e_1 - (x_3,e_2)e_2\|}.$$

If this process is continued in the same way, it clearly produces an 
orthonormal set $\{e_1, e_2, \ldots, e_n, \ldots\}$ with the required property. 
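
The process translates almost word for word into a short program. The following is a minimal sketch, assuming a finite linearly independent list of vectors in the space of complex triples with the inner product of Example 1 of Sec. 52; the function name `gram_schmidt` and the sample vectors are our own and are not part of the text.

```python
def inner(x, y):
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm(x):
    return inner(x, x).real ** 0.5

def gram_schmidt(vectors):
    # vectors: a linearly independent list x_1, x_2, ...
    # Returns the orthonormal list e_1, e_2, ... built as in the text.
    es = []
    for x in vectors:
        v = list(x)
        for e in es:
            c = inner(x, e)                        # component of x along e
            v = [vi - c * ei for vi, ei in zip(v, e)]
        n = norm(v)                                # non-zero by linear independence
        es.append([vi / n for vi in v])
    return es

xs = [[1 + 0j, 1j, 0j], [1 + 0j, 0j, 1 + 0j], [0j, 1 + 0j, 1 + 0j]]
es = gram_schmidt(xs)

# The matrix of inner products (e_i, e_j) should be the identity matrix.
gram = [[inner(a, b) for b in es] for a in es]
is_orthonormal = all(abs(gram[i][j] - (1 if i == j else 0)) < 1e-12
                     for i in range(3) for j in range(3))
print(is_orthonormal)
```

The sketch follows the formulas displayed above literally. In floating-point work on nearly dependent vectors this literal form can lose accuracy, and a rearranged version of the same computation is often preferred; that refinement lies outside our present concerns.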

Example 4. Many orthonormal sets of great interest and importance 
in analysis can be obtained conveniently by applying the Gram-Schmidt 
process to sequences of simple functions. 

(a) In the space $L_2$ associated with the interval $[-1,1]$, the 
functions $x^n$ ($n = 0, 1, 2, \ldots$) are linearly independent. If we take these 
functions to be the $x_n$’s in the Gram-Schmidt process, then the $e_n$’s are 
the normalized Legendre polynomials. 

(b) Consider the space $L_2$ over the entire real line. If the $x_n$’s 
here are taken to be the functions $x^n e^{-x^2/2}$ ($n = 0, 1, 2, \ldots$), then the 
corresponding $e_n$’s are the normalized Hermite functions. 

(c) Consider the space $L_2$ associated with the interval $[0,+\infty)$. 
If the $x_n$’s are the functions $x^n e^{-x}$ ($n = 0, 1, 2, \ldots$), then the $e_n$’s are 
the normalized Laguerre functions. 
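
Part (a) can be followed by hand or by machine for the first few powers of $x$. The sketch below makes two simplifying assumptions of our own: polynomials with real coefficients are stored as lists of exact fractions (constant term first), and the normalization step is omitted so that the arithmetic stays rational. The resulting polynomials therefore differ from the $e_n$’s of the text by positive scalar factors. The names `inner`, `monomial`, and `orthogonalize` are hypothetical.

```python
from fractions import Fraction

def inner(p, q):
    # Integral over [-1, 1] of p(x) q(x) dx for real polynomials given as
    # coefficient lists; the integral of x^k is 2/(k+1) for even k, 0 for odd k.
    total = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if (i + j) % 2 == 0:
                total += a * b * Fraction(2, i + j + 1)
    return total

def monomial(n, size):
    # Coefficient list of x^n, padded to the given length.
    return [Fraction(int(k == n)) for k in range(size)]

def orthogonalize(count):
    # Gram-Schmidt on 1, x, x^2, ... without the normalization step.
    vs = []
    for n in range(count):
        x_n = monomial(n, count)
        v = list(x_n)
        for u in vs:
            c = inner(x_n, u) / inner(u, u)
            v = [vi - c * ui for vi, ui in zip(v, u)]
        vs.append(v)
    return vs

polys = orthogonalize(4)
for p in polys:
    print([str(c) for c in p])
```

The third and fourth polynomials produced are $x^2 - \tfrac{1}{3}$ and $x^3 - \tfrac{3}{5}x$, which are scalar multiples of the Legendre polynomials of degrees 2 and 3.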


Each of the orthonormal sets described in the above example can be 
shown to be complete in its corresponding Hilbert space. The analysis 
involved in a detailed study of these matters is quite complicated and has 
no proper place in the present book. The reader should recognize, 
however—and this is our only reason for mentioning the material in 
Examples 3 and 4—that the theory of Hilbert spaces does have significant 
contacts with many solid topics in analysis. 


Problems 
1. Let $\{e_1, e_2, \ldots, e_n\}$ be a finite orthonormal set in a Hilbert space 
   $H$, and let $x$ be a vector in $H$. If $\alpha_1, \alpha_2, \ldots, \alpha_n$ are arbitrary 
   scalars, show that $\|x - \sum_{i=1}^{n} \alpha_i e_i\|$ attains its minimum value $\Leftrightarrow$ 
   $\alpha_i = (x,e_i)$ for each $i$. (Hint: expand $\|x - \sum_{i=1}^{n} \alpha_i e_i\|^2$, add and 
   subtract $\sum_{i=1}^{n} |(x,e_i)|^2$, and obtain an expression of the form 
   $\sum_{i=1}^{n} |(x,e_i) - \alpha_i|^2$ in the result.) 

2. Show that the orthonormal sets described in Examples 1 and 2 are 
   complete. 

3. Show that every orthonormal set in a Hilbert space is contained in 
   some complete orthonormal set, and use this fact to give an 
   alternative proof of Theorem 53-B. 

4. Prove that a Hilbert space $H$ is separable $\Leftrightarrow$ every orthonormal set in 
   $H$ is countable. 

5. Show that an orthonormal set in a Hilbert space is linearly 
   independent, and use this to prove that a Hilbert space is 
   finite-dimensional $\Leftrightarrow$ every complete orthonormal set is a basis. 

6. Prove that any two complete orthonormal sets in a Hilbert space $H$ 
   have the same cardinal number. This cardinal number is called the 
   orthogonal dimension of $H$ (if $H$ has no complete orthonormal sets, its 
   orthogonal dimension is said to be 0). 

7. If $H$ and $H'$ are Hilbert spaces, prove that $H$ is isometrically 
   isomorphic to $H'$ $\Leftrightarrow$ they have the same orthogonal dimension. (Hint: 
   by Eq. 52-(2), an isometric isomorphism $T$ preserves inner products, 
   in the sense that $(T(x),T(y)) = (x,y)$.) 

8. Let $S$ be a non-empty set, and let $l_2(S)$ be the set of all complex 
   functions $f$ defined on $S$ with the following two properties: 

   (1) $\{s : f(s) \ne 0\}$ is empty or countable; 

   (2) $\sum |f(s)|^2 < \infty$. 

   These functions clearly form a complex linear space with respect to 
   pointwise addition and scalar multiplication. Show that $l_2(S)$ 
   becomes a Hilbert space if the norm and inner product are defined by 
   $\|f\| = (\sum |f(s)|^2)^{1/2}$ and $(f,g) = \sum f(s)\overline{g(s)}$. Show also that the set of 
   all functions defined on $S$ which have the value 1 at a single point 
   and are 0 elsewhere is a complete orthonormal set in $l_2(S)$. We shall 
   see in the next problem that Hilbert spaces of the type described 
   here are universal models for all non-zero Hilbert spaces. 

9. Let $S = \{e_i\}$ be a complete orthonormal set in a Hilbert space $H$. 
   Each vector $x$ in $H$ determines a function $f$ defined on $S$ by 

   $$f(e_i) = (x,e_i),$$

   and Theorems B and C tell us that $f$ is in $l_2(S)$. Show that the 
   mapping $x \to f$ is an isometric isomorphism of $H$ onto $l_2(S)$. 


55. THE CONJUGATE SPACE H* 


We pointed out in the introduction to this chapter that one of the 
fundamental properties of a Hilbert space $H$ is the fact that there is a 
natural correspondence between the vectors in $H$ and the functionals in 
$H^*$. Our purpose in this section is to develop the features of this 
correspondence which are relevant to our work with operators in the rest of the 
chapter. 

Let $y$ be a fixed vector in $H$, and consider the function $f_y$ defined 
on $H$ by $f_y(x) = (x,y)$. It is easy to see that $f_y$ is linear, for 

$$f_y(x_1 + x_2) = (x_1 + x_2, y) = (x_1,y) + (x_2,y) = f_y(x_1) + f_y(x_2)$$

and 

$$f_y(\alpha x) = (\alpha x, y) = \alpha(x,y) = \alpha f_y(x).$$

Further, $f_y$ is continuous and is therefore a functional, for Schwarz’s 
inequality gives 

$$|f_y(x)| = |(x,y)| \le \|x\|\,\|y\|,$$

which shows that $\|f_y\| \le \|y\|$. Even more, equality is attained here, 
that is, $\|f_y\| = \|y\|$. This is clear if $y = 0$; and if $y \ne 0$, it follows from 

$$\|f_y\| = \sup\{|f_y(x)| : \|x\| = 1\} \ge \Bigl|f_y\Bigl(\frac{y}{\|y\|}\Bigr)\Bigr| = \Bigl|\Bigl(\frac{y}{\|y\|}, y\Bigr)\Bigr| = \|y\|.$$

To summarize, we have seen that $y \to f_y$ is a norm-preserving mapping of 
$H$ into $H^*$. This observation would be of no more than passing interest 
if it were not for the fact that every functional in $H^*$ arises in just this way. 



Theorem A. Let $H$ be a Hilbert space, and let $f$ be an arbitrary functional in $H^*$. Then there exists a unique vector $y$ in $H$ such that

$$f(x) = (x,y) \tag{1}$$

for every $x$ in $H$.

PROOF. It is easy to see that if such a $y$ exists, then it is necessarily unique. For if we also have $f(x) = (x,y')$ for all $x$, then $(x,y') = (x,y)$ and $(x,\, y' - y) = 0$ for all $x$; and since 0 is the only vector orthogonal to every vector, this implies that $y' - y = 0$ or $y' = y$.

We now turn to the problem of showing that $y$ does exist. If $f = 0$, then it clearly suffices to choose $y = 0$. We may therefore assume that $f \ne 0$. The null space $M$ of $f$ is thus a proper closed linear subspace of $H$, and by Theorem 53-B, there exists a non-zero vector $y_0$ which is orthogonal to $M$. We show that if $\alpha$ is a suitably chosen scalar, then the vector $y = \alpha y_0$ meets our requirements. We first observe that no matter what $\alpha$ may be, (1) is true for every $x$ in $M$; for $f(x) = 0$ for such an $x$, and since $x$ is orthogonal to $y_0$, we also have $(x,y) = 0$. This allows us to focus our attention on choosing $\alpha$ in such a way that (1) is true for $x = y_0$. The condition this imposes on $\alpha$ is that

$$f(y_0) = (y_0,\, \alpha y_0) = \overline{\alpha}\,\|y_0\|^2.$$

We therefore choose $\alpha$ to be $\overline{f(y_0)}/\|y_0\|^2$, and it follows that (1) is true for every $x$ in $M$ and for $x = y_0$. It is easily seen that each $x$ in $H$ can be written in the form $x = m + \beta y_0$ with $m$ in $M$: all that is necessary is to choose $\beta$ in such a way that $f(x - \beta y_0) = f(x) - \beta f(y_0) = 0$, and this is accomplished by putting $\beta = f(x)/f(y_0)$ (note that $f(y_0) \ne 0$, since the non-zero vector $y_0$ is orthogonal to $M$ and therefore does not lie in $M$). Our conclusion that (1) is true for every $x$ in $H$ now follows at once from

$$f(x) = f(m + \beta y_0) = f(m) + \beta f(y_0) = (m,y) + \beta(y_0,y) = (m + \beta y_0,\, y) = (x,y).$$
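As a concrete check of Theorem A, the following worked example carries out the construction of the proof in the two-dimensional Hilbert space $\mathbb{C}^2$ with the usual inner product $(x,y) = x_1\overline{y_1} + x_2\overline{y_2}$. The particular functional $f$ is an assumption chosen only for illustration and does not come from the text; the block assumes the amsmath and amssymb packages.

```latex
% Illustrative example (not from the text): H = C^2, f(x) = 2x_1 + i x_2.
\begin{align*}
M &= \{x : 2x_1 + i x_2 = 0\} = \{\lambda(1,\,2i) : \lambda \in \mathbb{C}\}, \\
y_0 &= (2,\,-i), \qquad
  ((1,2i),\, y_0) = 1\cdot\overline{2} + 2i\cdot\overline{(-i)} = 2 + 2i\cdot i = 0, \\
f(y_0) &= 2\cdot 2 + i\cdot(-i) = 5, \qquad \|y_0\|^2 = |2|^2 + |{-i}|^2 = 5, \\
\alpha &= \overline{f(y_0)}/\|y_0\|^2 = 1, \qquad y = \alpha y_0 = (2,\,-i), \\
(x,y) &= x_1\cdot\overline{2} + x_2\cdot\overline{(-i)} = 2x_1 + i x_2 = f(x).
\end{align*}
```

In agreement with the norm-preserving property established before the theorem, this functional has norm $\|f\| = \|y\| = \sqrt{5}$.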


This result tells us that the norm-preserving mapping of $H$ into $H^*$ defined by

$$y \to f_y, \quad \text{where } f_y(x) = (x,y), \tag{2}$$

is actually a mapping of $H$ onto $H^*$. It would be pleasant if (2) were also a linear mapping. This is not quite true, however, for

$$f_{y_1 + y_2} = f_{y_1} + f_{y_2} \quad \text{and} \quad f_{\alpha y} = \overline{\alpha} f_y. \tag{3}$$

It is an easy consequence of (3) that the mapping (2) is an isometry, for $\|f_x - f_y\| = \|f_{x-y}\| = \|x - y\|$. We state several interesting additional facts about this mapping (and what it enables us to do) in the problems, and we leave their verification to the reader. It should be remembered, however, that the real significance of this entire circle of ideas lies in its influence on the theory of the operators on $H$. We begin the treatment of these matters in the next section.


Problems 


1. Verify relations (3).

2. Let $H$ be a Hilbert space, and show that $H^*$ is also a Hilbert space with respect to the inner product defined by $(f_x,f_y) = (y,x)$. In just the same way, the fact that $H^*$ is a Hilbert space implies that $H^{**}$ is a Hilbert space whose inner product is given by $(F_f,F_g) = (g,f)$.

3. Let $H$ be a Hilbert space. We have two natural mappings of $H$ into $H^{**}$, the second of which is onto: the Banach space natural imbedding $x \to F_x$, where $F_x(f) = f(x)$, and the product mapping $x \to f_x \to F_{f_x}$, where $f_x(y) = (y,x)$ and $F_{f_x}(f) = (f,f_x)$. Show that these mappings are equal, and conclude that $H$ is reflexive. Show also that $(F_x,F_y) = (x,y)$.


56. THE ADJOINT OF AN OPERATOR 


Throughout the rest of this chapter, we focus our attention on a 
fixed but arbitrary Hilbert space H, and unless we specifically state 
otherwise, it is to be understood that H is the context for all our discus- 
sions and theorems. 

Let $T$ be an operator on $H$. We saw in Sec. 51 that $T$ gives rise to an operator $T^*$ (its conjugate) on $H^*$, where $T^*$ is defined by

$$(T^*f)x = f(Tx).^1$$


We also saw that the mapping $T \to T^*$ is an isometric isomorphism of $\mathcal{B}(H)$ into $\mathcal{B}(H^*)$ which reverses products and preserves the identity transformation. In the same way, $T^*$ gives rise to an operator $T^{**}$ on $H^{**}$; and since $H$ is reflexive, it follows that $T^{**} = T$ when $H^{**}$ is identified with $H$ by means of the natural imbedding. These statements depend only on the fact that $H$ is a reflexive Banach space. We now bring its Hilbert space character into the picture, and we use the natural correspondence between $H$ and $H^*$ discussed in the previous section to pull $T^*$ down to $H$. The details of this procedure are as follows (see Fig. 38). Let $y$ be a vector in $H$, and $f_y$ its corresponding functional in $H^*$; operate with $T^*$ on $f_y$ to obtain a functional $f_z = T^*f_y$; and return to its corresponding vector $z$ in $H$. There are three mappings under consideration here, and we are forming their product:

$$y \to f_y \to T^*f_y = f_z \to z. \tag{1}$$

[Fig. 38. The conjugate and the adjoint of $T$.]


We write $z = T^*y$, and we call this new mapping $T^*$ of $H$ into itself the adjoint of $T$. The same symbol is used for the adjoint of $T$ as for its conjugate because these two mappings are actually the same if $H$ and $H^*$ are identified by means of the natural correspondence. It is easy to keep track of whether $T^*$ signifies the conjugate or the adjoint of $T$ by noticing whether it operates on functionals or on vectors. The action of the adjoint can be linked more closely to the structure of $H$ by observing that for every vector $x$ we have $(T^*f_y)x = f_y(Tx) = (Tx,y)$ and $(T^*f_y)x = f_z(x) = (x,z) = (x,T^*y)$, so that

$$(Tx,y) = (x,T^*y) \tag{2}$$
for all $x$ and $y$.

¹ In working with operators, it is common practice to omit parentheses whenever it seems convenient. There is evidently no impairment of clarity in writing $(T^*f)x = f(Tx)$ instead of $[T^*(f)](x) = f(T(x))$, and there will be a considerable gain when we consider operators and inner products together, as we do below.

Equation (2) is much more than merely a property of the adjoint of $T$, for it uniquely determines this adjoint. The proof is simple: if $T'$ is any mapping of $H$ into itself such that $(Tx,y) = (x,T'y)$ for all $x$ and $y$, then $(x,T'y) = (x,T^*y)$ for all $x$, so $T'y = T^*y$;¹ and since the latter is true for all $y$, $T' = T^*$.

Our remarks in the above paragraph have shown that to each operator $T$ on $H$ there corresponds a unique mapping $T^*$ of $H$ into itself (called the adjoint of $T$) which satisfies relation (2) for all $x$ and $y$. There is a more direct but less natural approach to these ideas, one which avoids any reference to the conjugate of $T$. If $y$ is fixed, it is clear that the expression $(Tx,y)$ is a scalar-valued continuous linear function of $x$. By Theorem 55-A, there exists a unique vector $z$ such that $(Tx,y) = (x,z)$ for all $x$. We now write $z = T^*y$, and since $y$ is arbitrary, we again have relation (2) for all $x$ and $y$. The fact that $T^*$ is uniquely determined by (2) follows just as before.
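In the finite-dimensional case the adjoint determined by relation (2) is the familiar conjugate transpose. The following sketch checks relation (2) for one operator on $\mathbb{C}^2$; the matrix is a hypothetical example, the inner product is assumed to be $(x,y) = x_1\overline{y_1} + x_2\overline{y_2}$, and the block assumes the amsmath package.

```latex
% Illustrative example (not from the text): an operator on C^2 and its adjoint.
\begin{align*}
T &= \begin{pmatrix} 1 & i \\ 0 & 2 \end{pmatrix}, &
Tx &= (x_1 + i x_2,\; 2x_2), \\
T^* &= \begin{pmatrix} 1 & 0 \\ -i & 2 \end{pmatrix}, &
T^*y &= (y_1,\; -i y_1 + 2y_2), \\
(Tx,y) &= (x_1 + i x_2)\overline{y_1} + 2x_2\overline{y_2}, \\
(x,T^*y) &= x_1\overline{y_1} + x_2\overline{(-i y_1 + 2y_2)}
          = x_1\overline{y_1} + i x_2\overline{y_1} + 2x_2\overline{y_2}.
\end{align*}
```

The two expressions agree for all $x$ and $y$, so by the uniqueness argument the conjugate transpose is the adjoint of this $T$.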

The principal value of our approach to the definition of the adjoint (as opposed to that just mentioned) lies in the motivation it provides for considering adjoints at all. We can express this by emphasizing that an operator on a Banach space always has a conjugate which operates on the conjugate space; and when the Banach space happens to be a Hilbert space, then, as we have seen, the natural correspondence discussed in the previous section makes it almost inevitable that we regard the conjugate as an operator on the space itself. Once the definition of the adjoint is fully understood, however, there is no further need to mention conjugates. All our future work with adjoints will be based on Eq. (2), and from this point on, the symbol $T^*$ will always signify the adjoint of $T$ (and never its conjugate).

As our first step in exploring the properties of adjoints, we verify that $T^*$ actually is an operator on $H$ (all we know so far is that it maps $H$ into itself). For any $y$ and $z$, and for all $x$, we have

$$(x,\, T^*(y + z)) = (Tx,\, y + z) = (Tx,y) + (Tx,z) = (x,T^*y) + (x,T^*z) = (x,\, T^*y + T^*z),$$

so

$$T^*(y + z) = T^*y + T^*z.$$

The relation

$$T^*(\alpha y) = \alpha T^*y$$

is proved similarly, so $T^*$ is linear. It remains to be seen that $T^*$ is continuous; and to prove this, we note that

$$\|T^*y\|^2 = (T^*y,T^*y) = (TT^*y,y) \le \|TT^*y\|\,\|y\| \le \|T\|\,\|T^*y\|\,\|y\|$$

implies that $\|T^*y\| \le \|T\|\,\|y\|$ for all $y$, so

$$\|T^*\| \le \|T\|.$$

¹ The reasoning here depends on the fact that if $y_1$ and $y_2$ are vectors such that $(x,y_1) = (x,y_2)$ for all $x$, then $(x,\, y_1 - y_2) = 0$ for all $x$, so $y_1 - y_2 = 0$ or $y_1 = y_2$.

These facts tell us that $T \to T^*$ is a mapping of $\mathcal{B}(H)$ into itself. This mapping is called the adjoint operation on $\mathcal{B}(H)$.


Theorem A. The adjoint operation $T \to T^*$ on $\mathcal{B}(H)$ has the following properties:

(1) $(T_1 + T_2)^* = T_1^* + T_2^*$;

(2) $(\alpha T)^* = \overline{\alpha}T^*$;

(3) $(T_1T_2)^* = T_2^*T_1^*$;

(4) $T^{**} = T$;

(5) $\|T^*\| = \|T\|$;

(6) $\|T^*T\| = \|T\|^2$.

PROOF. The arguments used in proving (1) to (4) are all essentially the same. As an illustration of the method, we observe that (3) follows from the fact that for all $x$ and $y$ we have

$$(x,\,(T_1T_2)^*y) = (T_1T_2x,\,y) = (T_2x,\,T_1^*y) = (x,\,T_2^*T_1^*y).$$

To prove (5), we note that we already have $\|T^*\| \le \|T\|$; and if we apply this to $T^*$ instead of $T$ and use (4), we obtain $\|T\| = \|T^{**}\| \le \|T^*\|$. Half of (6) follows from (5) and the inequality 47-(5), for

$$\|T^*T\| \le \|T^*\|\,\|T\| = \|T\|\,\|T\| = \|T\|^2;$$

and the fact that $\|T\|^2 \le \|T^*T\|$ is an immediate consequence of

$$\|Tx\|^2 = (Tx,Tx) = (T^*Tx,x) \le \|T^*Tx\|\,\|x\| \le \|T^*T\|\,\|x\|^2.$$
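Properties (5) and (6) can be checked by hand on a small example. The sketch below uses a hypothetical operator on $\mathbb{C}^2$ (not taken from the text) whose norm can be read off directly; the block assumes the amsmath package.

```latex
% Illustrative example (not from the text): norms of T, T^* and T^*T on C^2.
\begin{align*}
T &= \begin{pmatrix} 0 & 2 \\ 0 & 0 \end{pmatrix}, &
Tx &= (2x_2,\, 0), &
\|Tx\| &= 2|x_2| \le 2\|x\| \ \text{(equality at } x = (0,1)), &
\|T\| &= 2, \\
T^* &= \begin{pmatrix} 0 & 0 \\ 2 & 0 \end{pmatrix}, &
T^*y &= (0,\, 2y_1), &
\|T^*y\| &= 2|y_1| \le 2\|y\| \ \text{(equality at } y = (1,0)), &
\|T^*\| &= 2, \\
T^*T &= \begin{pmatrix} 0 & 0 \\ 0 & 4 \end{pmatrix}, &
T^*Tx &= (0,\, 4x_2), &
\|T^*Tx\| &= 4|x_2| \le 4\|x\|, &
\|T^*T\| &= 4 = \|T\|^2.
\end{align*}
```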


The presence of the adjoint operation is what distinguishes the theory of the operators on $H$ from the more general theory of the operators on a reflexive Banach space.¹ In the next three sections, we use this operation as a tool by means of which we single out for special study certain types of operators on $H$ whose theory is particularly complete and satisfying.

¹ See Kakutani and Mackey [23].


Problems


1. Prove parts (1), (2), and (4) of Theorem A.

2. Show that the adjoint operation is one-to-one onto as a mapping of $\mathcal{B}(H)$ into itself.

3. Show that $0^* = 0$ and $I^* = I$. Use the latter to show that if $T$ is non-singular, then $T^*$ is also non-singular, and that in this case $(T^*)^{-1} = (T^{-1})^*$.

4. Show that $\|TT^*\| = \|T\|^2$.


57. SELF-ADJOINT OPERATORS 


There is an interesting analogy between the set $\mathcal{B}(H)$ of all operators on our Hilbert space $H$ and the set $C$ of all complex numbers. This can be summarized by observing that each is a complex algebra together with a mapping of the algebra onto itself ($T \to T^*$ and $z \to \overline{z}$) and that these mappings have similar properties. We shall see that this analogy is quite useful as an intuitive guide to the study of the operators on $H$. The most significant difference between these systems is that multiplication in the algebra $\mathcal{B}(H)$ is in general non-commutative, and it will become clear as we proceed that this is the primary source of the much greater structural complexity of $\mathcal{B}(H)$.

The most important subsystem of the complex plane is the real line, which is characterized by the relation $z = \overline{z}$. By analogy, we consider those operators $A$ on $H$ which equal their adjoints, that is, which satisfy the condition $A = A^*$. Such an operator is said to be self-adjoint. The self-adjoint operators on $H$ are evidently those which are related in the simplest possible way to their adjoints.

We know that $0^* = 0$ and $I^* = I$, so 0 and $I$ are self-adjoint. If $A_1$ and $A_2$ are self-adjoint, and if $\alpha$ and $\beta$ are real numbers, then

$$(\alpha A_1 + \beta A_2)^* = \overline{\alpha}A_1^* + \overline{\beta}A_2^* = \alpha A_1 + \beta A_2$$

shows that $\alpha A_1 + \beta A_2$ is also self-adjoint. Further, if $\{A_n\}$ is a sequence of self-adjoint operators which converges to an operator $A$, then it is easy to see that $A$ is also self-adjoint; for

$$\|A - A^*\| \le \|A - A_n\| + \|A_n - A_n^*\| + \|A_n^* - A^*\| = \|A - A_n\| + \|(A_n - A)^*\| = \|A - A_n\| + \|A_n - A\| = 2\|A_n - A\| \to 0$$

shows that $A - A^* = 0$, so $A = A^*$. These remarks yield our first theorem.


Theorem A. The self-adjoint operators in $\mathcal{B}(H)$ form a closed real linear subspace of $\mathcal{B}(H)$, and therefore a real Banach space, which contains the identity transformation.


The reader will notice that we have said nothing here about the 
product of two self-adjoint operators. Very little is known about such 
products, and the following simple result represents almost the extent of
our information. 


Theorem B. If $A_1$ and $A_2$ are self-adjoint operators on $H$, then their product $A_1A_2$ is self-adjoint $\Leftrightarrow$ $A_1A_2 = A_2A_1$.

PROOF. This is an obvious consequence of

$$(A_1A_2)^* = A_2^*A_1^* = A_2A_1.$$


The order properties of self-adjoint operators are more interesting, 
and we devote the remainder of the section to establishing some of the 
simpler facts in this direction. 

If $T$ is an arbitrary operator on $H$, it is easy to see that

$$T = 0 \Leftrightarrow (Tx,y) = 0$$

for all $x$ and $y$. It is also clear that $T = 0 \Rightarrow (Tx,x) = 0$ for all $x$. We shall need the converse of this implication.


Theorem C. If $T$ is an operator on $H$ for which $(Tx,x) = 0$ for all $x$, then $T = 0$.

PROOF. It suffices to show that $(Tx,y) = 0$ for any $x$ and $y$, and the proof of this depends on the following easily verified identity:

$$(T(\alpha x + \beta y),\, \alpha x + \beta y) - |\alpha|^2(Tx,x) - |\beta|^2(Ty,y) = \alpha\overline{\beta}(Tx,y) + \overline{\alpha}\beta(Ty,x). \tag{1}$$

We first observe that by our hypothesis, the left side of (1), and therefore the right side as well, equals 0 for all $\alpha$ and $\beta$. If we put $\alpha = 1$ and $\beta = 1$, then (1) becomes

$$(Tx,y) + (Ty,x) = 0; \tag{2}$$

and if we put $\alpha = i$ and $\beta = 1$, we get

$$i(Tx,y) - i(Ty,x) = 0. \tag{3}$$

Dividing (3) by $i$ and adding the result to (2) yields $2(Tx,y) = 0$, so $(Tx,y) = 0$ and the proof is complete.


It is worth emphasizing that this proof makes essential use of the 
fact that the scalars are the complex numbers (and not merely the real 
numbers). 
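The role of the complex scalars can be seen from a standard counterexample. The sketch below assumes the real Hilbert space $\mathbb{R}^2$ with the usual inner product, which lies outside the setting of this chapter, and shows that the conclusion of Theorem C fails there; it then evaluates the same matrix on a complex vector. The operator is chosen only for illustration, and the block assumes the amsmath and amssymb packages.

```latex
% Illustrative counterexample (not from the text): rotation through a right angle.
\begin{align*}
T &= \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \qquad
Tx = (-x_2,\; x_1), \\
\text{on } \mathbb{R}^2:\quad
(Tx,x) &= -x_2x_1 + x_1x_2 = 0 \ \text{for all } x, \ \text{although } T \ne 0; \\
\text{on } \mathbb{C}^2,\ x = (1,\,i):\quad
(Tx,x) &= (-i)\cdot\overline{1} + 1\cdot\overline{i} = -2i \ne 0.
\end{align*}
```

On the complex space the hypothesis of Theorem C is therefore not satisfied by this $T$, so there is no conflict with the theorem.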

We now apply this result to proving our next theorem, which 
indicates that self-adjoint operators are linked to real numbers by stronger 
ties than might be suspected from the loose analogy that led to their 
definition. 
Theorem D. An operator $T$ on $H$ is self-adjoint $\Leftrightarrow$ $(Tx,x)$ is real for all $x$.

PROOF. If $T$ is self-adjoint, then

$$\overline{(Tx,x)} = (x,Tx) = (x,T^*x) = (Tx,x)$$

shows that $(Tx,x)$ is real for all $x$. On the other hand, if $(Tx,x)$ is real for all $x$, then $(Tx,x) = \overline{(Tx,x)} = \overline{(x,T^*x)} = (T^*x,x)$ or

$$([T - T^*]x,\, x) = 0$$

for all $x$. By Theorem C, this implies that $T - T^* = 0$, so $T = T^*$.
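Theorem D can be illustrated on $\mathbb{C}^2$ with the inner product $(x,y) = x_1\overline{y_1} + x_2\overline{y_2}$. Both matrices below are hypothetical examples, not taken from the text: the first equals its conjugate transpose, the second does not. The block assumes the amsmath package.

```latex
% Illustrative example (not from the text): (Ax,x) is real exactly when A = A^*.
\begin{align*}
A &= \begin{pmatrix} 2 & i \\ -i & 3 \end{pmatrix} = A^*, \qquad
Ax = (2x_1 + i x_2,\; -i x_1 + 3x_2), \\
(Ax,x) &= 2|x_1|^2 + 3|x_2|^2 + i\,(x_2\overline{x_1} - \overline{x_2\overline{x_1}})
        = 2|x_1|^2 + 3|x_2|^2 - 2\,\mathrm{Im}(x_2\overline{x_1}) \in \mathbb{R}; \\
T &= \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \ne T^*, \qquad
(Tx,x) = x_2\overline{x_1}, \quad \text{which equals } i \text{ at } x = (1,\,i).
\end{align*}
```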


This theorem enables us to define a respectable and useful order relation on the set of all self-adjoint operators. If $A_1$ and $A_2$ are self-adjoint, we write $A_1 \le A_2$ if $(A_1x,x) \le (A_2x,x)$ for all $x$. The main elementary facts about this relation are summarized in


Theorem E. The real Banach space of all self-adjoint operators on $H$ is a partially ordered set whose linear structure and order structure are related by the following properties:

(1) if $A_1 \le A_2$, then $A_1 + A \le A_2 + A$ for every $A$;

(2) if $A_1 \le A_2$ and $\alpha \ge 0$, then $\alpha A_1 \le \alpha A_2$.

PROOF. The relation in question is obviously reflexive and transitive (see Sec. 8). To show that it is also antisymmetric, we assume that $A_1 \le A_2$ and $A_2 \le A_1$. This implies at once that $([A_1 - A_2]x,\, x) = 0$ for all $x$, so by Theorem C, $A_1 - A_2 = 0$ and $A_1 = A_2$. The proofs of properties (1) and (2) are easy. For instance, if $A_1 \le A_2$, so that $(A_1x,x) \le (A_2x,x)$ for all $x$, then $(A_1x,x) + (Ax,x) \le (A_2x,x) + (Ax,x)$ or $([A_1 + A]x,\, x) \le ([A_2 + A]x,\, x)$ for all $x$, so $A_1 + A \le A_2 + A$. The proof of (2) is similar.


A self-adjoint operator $A$ is said to be positive if $A \ge 0$, that is, if $(Ax,x) \ge 0$ for all $x$. It is clear that 0 and $I$ are positive, as are $T^*T$ and $TT^*$ for an arbitrary operator $T$.


Theorem F. If $A$ is a positive operator on $H$, then $I + A$ is non-singular. In particular, $I + T^*T$ and $I + TT^*$ are non-singular for an arbitrary operator $T$ on $H$.

PROOF. We must show that $I + A$ is one-to-one onto as a mapping of $H$ into itself. First, it is one-to-one, for

$$(I + A)x = 0 \Rightarrow Ax = -x \Rightarrow (Ax,x) = (-x,x) = -\|x\|^2 \ge 0 \Rightarrow x = 0.$$

We next show that the range $M$ of $I + A$ is closed. It follows from $\|(I + A)x\|^2 = \|x\|^2 + \|Ax\|^2 + 2(Ax,x)$, together with the assumption that $A$ is positive, that $\|x\| \le \|(I + A)x\|$. By this inequality and the completeness of $H$, $M$ is complete and therefore closed. We conclude the proof by observing that $M = H$; for otherwise there would exist a non-zero vector $x_0$ orthogonal to $M$, and this would contradict the fact that

$$(x_0,\, [I + A]x_0) = 0 \Rightarrow \|x_0\|^2 = -(Ax_0,x_0) \le 0 \Rightarrow x_0 = 0.$$


If the reader wonders why we fail to show that the partially ordered set of all self-adjoint operators is a lattice, the reason is simple: it isn't true. As a matter of fact, this system is about as far from being a lattice as a partially ordered set can be, for it can be shown that two operators in the set have a greatest lower bound $\Leftrightarrow$ they are comparable. This whole situation is intimately related to questions of commutativity for algebras of operators and is too complicated for us to explore here. For further details, see Kadison [22].


Problems 


1. Define a new operation of "multiplication" for self-adjoint operators by $A_1 \circ A_2 = (A_1A_2 + A_2A_1)/2$, and note that $A_1 \circ A_2$ is always self-adjoint and that it equals $A_1A_2$ whenever $A_1$ and $A_2$ commute. Show that this operation has the following properties:

$$A_1 \circ A_2 = A_2 \circ A_1,$$
$$A_1 \circ (A_2 + A_3) = A_1 \circ A_2 + A_1 \circ A_3,$$
$$\alpha(A_1 \circ A_2) = (\alpha A_1) \circ A_2 = A_1 \circ (\alpha A_2),$$

and $A \circ I = I \circ A = A$. Show also that $A_1 \circ (A_2 \circ A_3) = (A_1 \circ A_2) \circ A_3$ whenever $A_1$ and $A_3$ commute.

2. If $T$ is any operator on $H$, it is clear that $|(Tx,x)| \le \|Tx\|\,\|x\| \le \|T\|\,\|x\|^2$; so if $H \ne \{0\}$, we have $\sup\{|(Tx,x)|/\|x\|^2 : x \ne 0\} \le \|T\|$. Prove that if $T$ is self-adjoint, then equality holds here. (Hint: write $a = \sup\{|(Tx,x)|/\|x\|^2 : x \ne 0\} = \sup\{|(Tx,x)| : \|x\| = 1\}$, and show that $\|Tx\| \le a$ whenever $\|x\| = 1$ by putting $b = \|Tx\|^{1/2}$, if $Tx \ne 0$, and considering

$$4\|Tx\|^2 = (T(bx + b^{-1}Tx),\, bx + b^{-1}Tx) - (T(bx - b^{-1}Tx),\, bx - b^{-1}Tx) \le a\left[\|bx + b^{-1}Tx\|^2 + \|bx - b^{-1}Tx\|^2\right] = 4a\|Tx\|.)$$


58. NORMAL AND UNITARY OPERATORS 


An operator N on H is said to be normal if it commutes with its 
adjoint, that is, if NN* = N*N. The reason for the importance of 
normal operators will not become clear until the next chapter. We shall 
see that they are the most general operators on H for which a simple and 
revealing structure theory is possible. Our purpose in this section is to 
present a few of their more elementary properties which are necessary
for our later work. 

It is obvious that every self-adjoint operator is normal, and that if $N$ is normal and $\alpha$ is any scalar, then $\alpha N$ is also normal. Further, the limit $N$ of any convergent sequence $\{N_k\}$ of normal operators is normal; for we know that $N_k^* \to N^*$, so

$$\|NN^* - N^*N\| \le \|NN^* - N_kN_k^*\| + \|N_kN_k^* - N_k^*N_k\| + \|N_k^*N_k - N^*N\| = \|NN^* - N_kN_k^*\| + \|N_k^*N_k - N^*N\| \to 0,$$

which implies that $NN^* - N^*N = 0$. These remarks prove


Theorem A. The set of all normal operators on $H$ is a closed subset of $\mathcal{B}(H)$ which contains the set of all self-adjoint operators and is closed under scalar multiplication.


It is natural to wonder whether the sum and product of two normal 
operators are necessarily normal. They are not, but nevertheless, we can 
say a little in this direction. 


Theorem B. If $N_1$ and $N_2$ are normal operators on $H$ with the property that either commutes with the adjoint of the other, then $N_1 + N_2$ and $N_1N_2$ are normal.

PROOF. It is clear by taking adjoints that

$$N_1N_2^* = N_2^*N_1 \Leftrightarrow N_2N_1^* = N_1^*N_2,$$

so the assumption implies that each commutes with the adjoint of the other. To show that $N_1 + N_2$ is normal under the stated conditions, we have only to compare the results of the following computations:

$$(N_1 + N_2)(N_1 + N_2)^* = (N_1 + N_2)(N_1^* + N_2^*) = N_1N_1^* + N_1N_2^* + N_2N_1^* + N_2N_2^*$$

and

$$(N_1 + N_2)^*(N_1 + N_2) = (N_1^* + N_2^*)(N_1 + N_2) = N_1^*N_1 + N_1^*N_2 + N_2^*N_1 + N_2^*N_2.$$

The fact that $N_1N_2$ is normal follows similarly from

$$N_1N_2(N_1N_2)^* = N_1N_2N_2^*N_1^* = N_1N_2^*N_2N_1^* = N_2^*N_1N_1^*N_2 = N_2^*N_1^*N_1N_2 = (N_1N_2)^*N_1N_2.$$
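The commutation hypothesis of Theorem B cannot simply be dropped. The following sketch gives hypothetical $2 \times 2$ examples on $\mathbb{C}^2$ (not taken from the text) in which a sum and a product of normal operators fail to be normal; the adjoint is taken to be the conjugate transpose, and the block assumes the amsmath package.

```latex
% Illustrative examples (not from the text): sums and products of normal operators.
\begin{align*}
N_1 &= \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \ (\text{self-adjoint}), \qquad
N_2 = \begin{pmatrix} 0 & i \\ i & 0 \end{pmatrix}, \quad N_2N_2^* = N_2^*N_2 = I, \\
S &= N_1 + N_2 = \begin{pmatrix} 1 & i \\ i & 0 \end{pmatrix}, \qquad
SS^* = \begin{pmatrix} 2 & -i \\ i & 1 \end{pmatrix} \ne
\begin{pmatrix} 2 & i \\ -i & 1 \end{pmatrix} = S^*S; \\
A_1 &= \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \quad
A_2 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad
T = A_1A_2 = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \\
TT^* &= \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \ne
\begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} = T^*T.
\end{align*}
```

In the first example $N_1N_2^* \ne N_2^*N_1$, and in the second $A_1A_2 \ne A_2A_1$, so neither pair satisfies the hypothesis of the theorem.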


By definition, a self-adjoint operator $A$ is one which satisfies the identity $A^*x = Ax$. Many properties of self-adjoint operators do not depend on this, but only on the weaker identity $\|A^*x\| = \|Ax\|$. Our next theorem shows that all such properties are shared by normal operators.


Theorem C. An operator $T$ on $H$ is normal $\Leftrightarrow$ $\|T^*x\| = \|Tx\|$ for every $x$.

PROOF. In view of Theorem 57-C, this is implied by the fact that

$$\|T^*x\| = \|Tx\| \Leftrightarrow \|T^*x\|^2 = \|Tx\|^2 \Leftrightarrow (T^*x,T^*x) = (Tx,Tx) \Leftrightarrow (TT^*x,x) = (T^*Tx,x) \Leftrightarrow ([TT^* - T^*T]x,\,x) = 0.$$


The following consequence of this result will be useful in our later 
work. 


Theorem D. If $N$ is a normal operator on $H$, then $\|N^2\| = \|N\|^2$.

PROOF. The preceding theorem shows that

$$\|N^2x\| = \|NNx\| = \|N^*Nx\|$$

for every $x$, and this implies that $\|N^2\| = \|N^*N\|$. By Theorem 56-A, we have $\|N^*N\| = \|N\|^2$, so the proof is complete.
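The normality assumption in Theorem D is essential. As a minimal sketch, consider the following hypothetical nilpotent operator on $\mathbb{C}^2$ (an illustrative choice, not taken from the text); the block assumes the amsmath package.

```latex
% Illustrative example (not from the text): a non-normal T with ||T^2|| < ||T||^2.
\begin{align*}
T &= \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}, \qquad
Tx = (x_2,\, 0), \qquad \|T\| = 1 \ \text{(attained at } x = (0,1)), \\
T^2 &= 0, \qquad \|T^2\| = 0 < 1 = \|T\|^2, \\
TT^* &= \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \ne
\begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} = T^*T.
\end{align*}
```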


We know that any complex number $z$ can be expressed uniquely in the form $z = a + ib$ where $a$ and $b$ are real numbers, and that these real numbers are called the real and imaginary parts of $z$ and are given by $a = (z + \overline{z})/2$ and $b = (z - \overline{z})/2i$. The analogy between general operators and complex numbers, and between self-adjoint operators and real numbers, suggests that for an arbitrary operator $T$ on $H$ we form $A_1 = (T + T^*)/2$ and $A_2 = (T - T^*)/2i$. $A_1$ and $A_2$ are clearly self-adjoint, and they have the property that $T = A_1 + iA_2$. The uniqueness of this expression for $T$ follows at once from the fact that

$$T^* = A_1 - iA_2.$$

The self-adjoint operators $A_1$ and $A_2$ are called the real part and the imaginary part of $T$.

We emphasized earlier that the complicated structure of $\mathcal{B}(H)$ is due in large part to the fact that operator multiplication is in general non-commutative. Since our future work will be focused mainly on normal operators, it is of interest to see, as the following theorem shows, that the existence of non-normal operators can be traced directly to the non-commutativity of self-adjoint operators.


Theorem E. If $T$ is an operator on $H$, then $T$ is normal $\Leftrightarrow$ its real and imaginary parts commute.

PROOF. If $A_1$ and $A_2$ are the real and imaginary parts of $T$, so that $T = A_1 + iA_2$ and $T^* = A_1 - iA_2$, then

$$TT^* = (A_1 + iA_2)(A_1 - iA_2) = A_1^2 + A_2^2 + i(A_2A_1 - A_1A_2)$$

and

$$T^*T = (A_1 - iA_2)(A_1 + iA_2) = A_1^2 + A_2^2 + i(A_1A_2 - A_2A_1).$$

It is clear that if $A_1A_2 = A_2A_1$, then $TT^* = T^*T$. Conversely, if $TT^* = T^*T$, then $A_1A_2 - A_2A_1 = A_2A_1 - A_1A_2$, so $2A_1A_2 = 2A_2A_1$ and $A_1A_2 = A_2A_1$.
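The decomposition $T = A_1 + iA_2$ and the criterion of Theorem E can be worked out explicitly in a small case. The operator below is a hypothetical example on $\mathbb{C}^2$, not taken from the text, and the block assumes the amsmath package.

```latex
% Illustrative example (not from the text): real and imaginary parts of a non-normal T.
\begin{align*}
T &= \begin{pmatrix} 0 & 2 \\ 0 & 0 \end{pmatrix}, \qquad
T^* = \begin{pmatrix} 0 & 0 \\ 2 & 0 \end{pmatrix}, \\
A_1 &= \frac{T + T^*}{2} = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad
A_2 = \frac{T - T^*}{2i} = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \\
A_1 + iA_2 &= \begin{pmatrix} 0 & 1 + 1 \\ 1 - 1 & 0 \end{pmatrix} = T, \\
A_1A_2 &= \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix} \ne
\begin{pmatrix} -i & 0 \\ 0 & i \end{pmatrix} = A_2A_1.
\end{align*}
```

Since the real and imaginary parts do not commute, Theorem E says that this $T$ is not normal, which agrees with the direct computation $TT^* = \begin{pmatrix} 4 & 0 \\ 0 & 0 \end{pmatrix} \ne \begin{pmatrix} 0 & 0 \\ 0 & 4 \end{pmatrix} = T^*T$.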


Perhaps the most important subsystem of the complex plane after the real line is the unit circle, which is characterized by either of the equivalent identities $|z| = 1$ or $z\overline{z} = \overline{z}z = 1$. An operator $U$ on $H$ which satisfies the equation $UU^* = U^*U = I$ is said to be unitary. Unitary operators, which are obviously normal, are thus the natural analogues of complex numbers of absolute value 1. It is clear from the definition that the unitary operators on $H$ are precisely the non-singular operators whose inverses equal their adjoints. The geometric significance of these operators is best understood in the light of our next theorem.


Theorem F. If $T$ is an operator on $H$, then the following conditions are all equivalent to one another:

(1) $T^*T = I$;

(2) $(Tx,Ty) = (x,y)$ for all $x$ and $y$;

(3) $\|Tx\| = \|x\|$ for all $x$.

PROOF. If (1) is true, then $(T^*Tx,y) = (x,y)$ or $(Tx,Ty) = (x,y)$ for all $x$ and $y$, so (2) is true; and if (2) is true, then by taking $y = x$ we obtain $(Tx,Tx) = (x,x)$ or $\|Tx\|^2 = \|x\|^2$ for all $x$, so (3) is true. The fact that (3) implies (1) is a consequence of Theorem 57-C and the following chain of implications:

$$\|Tx\| = \|x\| \Rightarrow \|Tx\|^2 = \|x\|^2 \Rightarrow (Tx,Tx) = (x,x) \Rightarrow (T^*Tx,x) = (x,x) \Rightarrow ([T^*T - I]x,\,x) = 0.$$


An operator on $H$ with property (3) of this theorem is simply an isometric isomorphism of $H$ into itself. That an operator of this kind need not be unitary is easily seen by considering the operator on $l_2$ defined by

$$T\{x_1, x_2, \ldots\} = \{0, x_1, x_2, \ldots\},$$

which preserves norms but has no inverse. These ideas lead at once to


Theorem G. An operator $T$ on $H$ is unitary $\Leftrightarrow$ it is an isometric isomorphism of $H$ onto itself.

PROOF. If $T$ is unitary, then we know from the definition that it is onto; and since by Theorem F it preserves norms, it is an isometric isomorphism of $H$ onto itself. Conversely, if $T$ is an isometric isomorphism of $H$ onto itself, then $T^{-1}$ exists, and by Theorem F we have $T^*T = I$. It now follows that $(T^*T)T^{-1} = IT^{-1}$, so $T^* = T^{-1}$ and $TT^* = T^*T = I$, which shows that $T$ is unitary.
This theorem makes quite clear the nature of unitary operators: they are precisely those one-to-one mappings of $H$ onto itself which preserve all structure, namely the linear operations, the norm, and the inner product.
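For the operator on $l_2$ described before Theorem G, the gap between conditions $T^*T = I$ and $TT^* = I$ can be made explicit. The formula for $T^*$ in the sketch below is not stated in the text; it is obtained here from relation 56-(2), with the inner product on $l_2$ taken to be $(x,y) = \sum_n x_n\overline{y_n}$. The block assumes the amsmath package.

```latex
% Sketch (adjoint formula derived here, not stated in the text): the shift on l_2.
\begin{align*}
T\{x_1, x_2, \ldots\} &= \{0, x_1, x_2, \ldots\}, \\
(Tx,y) &= \sum_{n \ge 1} x_n\overline{y_{n+1}} = (x,\, T^*y)
  \quad\text{with}\quad T^*\{y_1, y_2, \ldots\} = \{y_2, y_3, \ldots\}, \\
T^*T\{x_1, x_2, \ldots\} &= T^*\{0, x_1, x_2, \ldots\} = \{x_1, x_2, \ldots\},
  \quad\text{so } T^*T = I, \\
TT^*\{x_1, x_2, \ldots\} &= T\{x_2, x_3, \ldots\} = \{0, x_2, x_3, \ldots\},
  \quad\text{so } TT^* \ne I.
\end{align*}
```

Thus $T$ satisfies condition (1) of Theorem F but is not unitary, in agreement with the fact that its range omits every sequence with $x_1 \ne 0$.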


Problems 


1. If $T$ is an arbitrary operator on $H$, and if $\alpha$ and $\beta$ are scalars such that $|\alpha| = |\beta|$, show that $\alpha T + \beta T^*$ is normal.

2. If $H$ is finite-dimensional, show that every isometric isomorphism of $H$ into itself is unitary.

3. Show that an operator $T$ on $H$ is unitary $\Leftrightarrow$ $T(\{e_i\})$ is a complete orthonormal set whenever $\{e_i\}$ is.

4. Show that the unitary operators on $H$ form a group.


59. PROJECTIONS 


According to the definition given in Sec. 50, a projection on a Banach space $B$ is an idempotent operator on $B$, that is, an operator $P$ with the property that $P^2 = P$. It was proved in that section that each projection $P$ determines a pair of closed linear subspaces $M$ and $N$ (the range and null space of $P$) such that $B = M \oplus N$, and also, conversely, that each such pair of closed linear subspaces $M$ and $N$ determines a projection $P$ with range $M$ and null space $N$. In this way, there is established a one-to-one correspondence between projections on $B$ and pairs of closed linear subspaces of $B$ which span the whole space and have only the zero vector in common.

The context of our present work, however, is the Hilbert space H, 
and not a general Banach space, and the structure which H enjoys in 
addition to being a Banach space enables us to single out for special 
attention those projections whose range and null space are orthogonal. 
Our first theorem gives a convenient characterization of these projections. 


Theorem A. If $P$ is a projection on $H$ with range $M$ and null space $N$, then $M \perp N$ $\Leftrightarrow$ $P$ is self-adjoint; and in this case, $N = M^\perp$.

PROOF. Each vector $z$ in $H$ can be written uniquely in the form $z = x + y$ with $x$ and $y$ in $M$ and $N$. If $M \perp N$, so that $x \perp y$, then $P^* = P$ will follow by Theorem 57-C from $(P^*z,z) = (Pz,z)$; and this is a consequence of

$$(P^*z,z) = (z,Pz) = (z,x) = (x + y,\, x) = (x,x) + (y,x) = (x,x)$$

and $(Pz,z) = (x,z) = (x,\, x + y) = (x,x) + (x,y) = (x,x)$. If, conversely, $P^* = P$, then the conclusion that $M \perp N$ follows from the fact that for any $x$ and $y$ in $M$ and $N$ we have

$$(x,y) = (Px,y) = (x,P^*y) = (x,Py) = (x,0) = 0.$$

All that remains is to see that if $M \perp N$, then $N = M^\perp$. It is clear that $N \subseteq M^\perp$; and if $N$ is a proper subset of $M^\perp$, and therefore a proper closed linear subspace of the Hilbert space $M^\perp$, then Theorem 53-B implies that there exists a non-zero vector $z_0$ in $M^\perp$ such that $z_0 \perp N$. Since $z_0 \perp M$ and $z_0 \perp N$, and since $H = M \oplus N$, it follows that $z_0 \perp H$. This is impossible, so we conclude that $N = M^\perp$.


A projection on $H$ whose range and null space are orthogonal is sometimes called a perpendicular projection. The only projections considered in the theory of Hilbert spaces are those which are perpendicular, so it is customary to omit the adjective and to refer to them simply as projections. In the light of this agreement and Theorem A, a projection on $H$ can be defined as an operator $P$ which satisfies the conditions $P^2 = P$ and $P^* = P$. The operators 0 and $I$ are projections, and they are distinct $\Leftrightarrow$ $H \ne \{0\}$.
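To see what the condition $P^* = P$ excludes, it may help to look at an idempotent operator that is not perpendicular. The matrix below is a hypothetical example on $\mathbb{C}^2$ (not taken from the text), with the adjoint taken to be the conjugate transpose; the block assumes the amsmath package.

```latex
% Illustrative example (not from the text): an idempotent that is not self-adjoint.
\begin{align*}
P &= \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}, \qquad
P^2 = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix} = P, \qquad
P^* = \begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix} \ne P, \\
M &= \{\lambda(1,0)\}, \qquad
N = \{x : x_1 + x_2 = 0\} = \{\lambda(1,-1)\}, \qquad
((1,0),\,(1,-1)) = 1 \ne 0, \\
P(1,1) &= (2,0), \qquad \|P(1,1)\| = 2 > \sqrt{2} = \|(1,1)\|.
\end{align*}
```

Here the range and null space are not orthogonal, as Theorem A predicts for an idempotent with $P^* \ne P$, and the last line shows that such an operator can increase the norm of a vector.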

The great importance of the projections on $H$ rests mainly on Theorem 53-D, which allows us to set up a natural one-to-one correspondence between projections and closed linear subspaces. To each projection $P$ there corresponds its range $M = \{Px : x \in H\}$, which is a closed linear subspace; and conversely, to each closed linear subspace $M$ there corresponds the projection $P$ with range $M$ defined by $P(x + y) = x$, where $x$ and $y$ are in $M$ and $M^\perp$. Either way, we speak of $P$ as the projection on $M$.

It is clear that $P$ is the projection on $M$ $\Leftrightarrow$ $I - P$ is the projection on $M^\perp$. Also, if $P$ is the projection on $M$, then

$$x \in M \Leftrightarrow Px = x \Leftrightarrow \|Px\| = \|x\|.$$

The first equivalence here was proved in Problem 44-11; and since for every $x$ in $H$ we have

$$\|x\|^2 = \|Px + (I - P)x\|^2 = \|Px\|^2 + \|(I - P)x\|^2, \tag{1}$$

the non-trivial part of the second is given by the following chain of implications:

$$\|Px\| = \|x\| \Rightarrow \|Px\|^2 = \|x\|^2 \Rightarrow \|(I - P)x\|^2 = 0 \Rightarrow Px = x.$$

Relation (1) also shows that $\|Px\| \le \|x\|$ for every $x$, so $\|P\| \le 1$. If $x$ is an arbitrary vector in $H$, it is easy to see that

$$(Px,x) = (PPx,x) = (Px,P^*x) = (Px,Px) = \|Px\|^2 \ge 0, \tag{2}$$

so $P$ is a positive operator ($0 \le P$) in the sense of Sec. 57. Since $I - P$ is also a projection, we also have $0 \le I - P$ or $P \le I$, so $0 \le P \le I$.

Let $T$ be an operator on $H$. A closed linear subspace $M$ of $H$ is said to be invariant under $T$ if $T(M) \subseteq M$. When this happens, the restriction of $T$ to $M$ can be regarded as an operator on $M$ alone, and the action of $T$ on vectors outside of $M$ can be ignored. If both $M$ and $M^\perp$ are invariant under $T$, we say that $M$ reduces $T$, or that $T$ is reduced by $M$. This situation is much more interesting, for it allows us to replace the study of $T$ as a whole by the study of its restrictions to $M$ and $M^\perp$, and it invites the hope that these restrictions will turn out to be operators of some particularly simple type. In the following four theorems, we translate these concepts into relations between $T$ and the projection on $M$.


Theorem B. A closed linear subspace $M$ of $H$ is invariant under an operator $T$ $\Leftrightarrow$ $M^\perp$ is invariant under $T^*$.

PROOF. Since $M^{\perp\perp} = M$ and $T^{**} = T$, it suffices by symmetry to prove that if $M$ is invariant under $T$, then $M^\perp$ is invariant under $T^*$. If $y$ is a vector in $M^\perp$, our conclusion will follow from $(x,T^*y) = 0$ for all $x$ in $M$. But this is an easy consequence of $(x,T^*y) = (Tx,y)$, for the invariance of $M$ under $T$ implies that $(Tx,y) = 0$.


Theorem C. A closed linear subspace $M$ of $H$ reduces an operator $T$ $\Leftrightarrow$ $M$ is invariant under both $T$ and $T^*$.

PROOF. This is obvious from the definitions and the preceding theorem.


Theorem D. If $P$ is the projection on a closed linear subspace $M$ of $H$, then $M$ is invariant under an operator $T$ $\Leftrightarrow$ $TP = PTP$.

PROOF. If $M$ is invariant under $T$ and $x$ is an arbitrary vector in $H$, then $TPx$ is in $M$, so $PTPx = TPx$ and $PTP = TP$. Conversely, if $TP = PTP$ and $x$ is a vector in $M$, then $Tx = TPx = PTPx$ is also in $M$, so $M$ is invariant under $T$.


Theorem E. If $P$ is the projection on a closed linear subspace $M$ of $H$, then $M$ reduces an operator $T$ $\Leftrightarrow$ $TP = PT$.

PROOF. $M$ reduces $T$ $\Leftrightarrow$ $M$ is invariant under $T$ and $T^*$ $\Leftrightarrow$ $TP = PTP$ and $T^*P = PT^*P$ $\Leftrightarrow$ $TP = PTP$ and $PT = PTP$. The last statement in this chain clearly implies that $TP = PT$; it also follows from it, as we see by multiplying $TP = PT$ on the right and left by $P$.
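The difference between invariance (Theorem D) and reduction (Theorem E) already shows up in two dimensions. In the following sketch the operator $T$ and the subspace $M$ are hypothetical choices on $\mathbb{C}^2$, not taken from the text; the block assumes the amsmath package.

```latex
% Illustrative example (not from the text): M is invariant under T but does not reduce T.
\begin{align*}
T &= \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, \qquad
M = \{\lambda(1,0)\}, \qquad
P = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \\
TP &= \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} = PTP
  \quad (\text{so } M \text{ is invariant under } T), \\
PT &= \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix} \ne TP
  \quad (\text{so } M \text{ does not reduce } T), \\
T(0,1) &= (1,1) \notin M^\perp = \{\lambda(0,1)\}.
\end{align*}
```

The last line confirms directly that $M^\perp$ is not invariant under this $T$.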


Our next theorem shows how projections can be used to express the 
statement that two closed linear subspaces of H are orthogonal. 


Theorem F. If $P$ and $Q$ are the projections on closed linear subspaces $M$ and $N$ of $H$, then $M \perp N$ $\Leftrightarrow$ $PQ = 0$ $\Leftrightarrow$ $QP = 0$.

PROOF. We first remark that the equivalence of $PQ = 0$ and $QP = 0$ is clear by taking adjoints. If $M \perp N$, so that $N \subseteq M^\perp$, then the fact that $Qx$ is in $N$ for every $x$ implies that $PQx = 0$, so $PQ = 0$. If, conversely, $PQ = 0$, then for every $x$ in $N$ we have $Px = PQx = 0$, so $N \subseteq M^\perp$ and $M \perp N$.


Motivated by this result, we say that two projections P and Q are 
orthogonal if PQ = 0. 

Our final theorem describes the circumstances under which a sum 
of projections is also a projection. 


Theorem G. If $P_1, P_2, \ldots, P_n$ are the projections on closed linear
subspaces $M_1, M_2, \ldots, M_n$ of H, then $P = P_1 + P_2 + \cdots + P_n$ is a
projection $\Leftrightarrow$ the $P_i$'s are pairwise orthogonal (in the sense that $P_iP_j = 0$
whenever $i \neq j$); and in this case, P is the projection on

$$M = M_1 + M_2 + \cdots + M_n.$$

PROOF. Since P is clearly self-adjoint, it is a projection $\Leftrightarrow$ it is
idempotent. If the $P_i$'s are pairwise orthogonal, then a simple computation
shows at once that P is idempotent. To prove the converse, we assume
that P is idempotent. Let x be a vector in the range of $P_i$, so that
$x = P_ix$. Then

$$\|x\|^2 = \|P_ix\|^2 \le \sum_{j=1}^n \|P_jx\|^2 = \sum_{j=1}^n (P_jx, x) = (Px, x) = \|Px\|^2 \le \|x\|^2.$$

We conclude that equality must hold all along the line here, so

$$\sum_{j=1}^n \|P_jx\|^2 = \|P_ix\|^2$$

and $\|P_jx\| = 0$ for $j \neq i$.

Thus the range of $P_i$ is contained in the null space of $P_j$, that is,
$M_i \subseteq M_j^\perp$, for every $j \neq i$. This means that $M_i \perp M_j$ whenever
$i \neq j$, and our conclusion that the $P_i$'s are pairwise orthogonal now follows
from the preceding theorem. We prove the final statement in two steps. First,
we observe that since $\|Px\| = \|x\|$ for every x in $M_i$, each $M_i$ is contained
in the range of P, and therefore M is also contained in the range of P.
Second, if x is a vector in the range of P, then

$$x = Px = P_1x + P_2x + \cdots + P_nx$$

is evidently in M.
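
As an informal numerical check of Theorem G (an illustration only, not part of the proof), the following sketch represents projections on real 3-space by their matrices relative to the standard basis. The helper names `matmul`, `matadd`, and `is_projection`, and the particular subspaces, are assumptions made for the example.

```python
from fractions import Fraction

def matmul(a, b):
    """Row-by-column product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matadd(a, b):
    """Entrywise sum of two square matrices."""
    n = len(a)
    return [[a[i][j] + b[i][j] for j in range(n)] for i in range(n)]

def is_projection(p):
    """True if p is symmetric (self-adjoint in the real case) and idempotent."""
    n = len(p)
    symmetric = all(p[i][j] == p[j][i] for i in range(n) for j in range(n))
    return symmetric and matmul(p, p) == p

ZERO = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
half = Fraction(1, 2)

# Projections on the first and second coordinate axes: pairwise orthogonal.
P1 = [[1, 0, 0], [0, 0, 0], [0, 0, 0]]
P2 = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
# Projection on the line spanned by (1, 1, 0); this line is not orthogonal
# to the first coordinate axis.
Q = [[half, half, 0], [half, half, 0], [0, 0, 0]]

orthogonal_sum_ok = matmul(P1, P2) == ZERO and is_projection(matadd(P1, P2))
non_orthogonal_sum_ok = is_projection(matadd(P1, Q))
```

Here `P1 + P2` is again a projection, while `P1 + Q` fails to be idempotent, in agreement with the theorem.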


There are many other ways in which the algebraic structure of the
set of all projections on H can be related to the geometry of its closed
linear subspaces, and several of these are given in the problems below.
The significance of projections in the general theory of operators on H
is the theme of the next chapter. As we shall see, the essence of the
matter (the spectral theorem) is that every normal operator is made of
projections in a way which clearly reveals the geometric nature of its
action on the vectors in H.


Problems 


1. If P and Q are the projections on closed linear subspaces M and N
   of H, prove that PQ is a projection $\Leftrightarrow$ $PQ = QP$. In this case, show
   that PQ is the projection on $M \cap N$.

2. If P and Q are the projections on closed linear subspaces M and N
   of H, prove that the following statements are all equivalent to one
   another:

   (a) $P \le Q$;

   (b) $\|Px\| \le \|Qx\|$ for every x;

   (c) $M \subseteq N$;

   (d) $PQ = P$;

   (e) $QP = P$.

   (Hint: the equivalence of (a) and (b) is easy to prove, as is that of
   (c), (d), and (e); prove that (d) implies (a) by using

   $$(Px, x) = \|Px\|^2 = \|PQx\|^2 \le \|Qx\|^2 = (Qx, x);$$

   and prove that (b) implies (c) by observing that if x is in M, then
   $\|x\| = \|Px\| \le \|Qx\| \le \|x\|$.)

3. Show that the projections on H form a complete lattice with respect
   to their natural ordering as self-adjoint operators. (Compare this
   situation with that described in the last paragraph of Sec. 57.)

4. If P and Q are the projections on closed linear subspaces M and N
   of H, prove that $Q - P$ is a projection $\Leftrightarrow$ $P \le Q$. In this case,
   show that $Q - P$ is the projection on $N \cap M^\perp$.


CHAPTER ELEVEN 


Finite-dimensional Spectral Theory


If T is an operator on a Hilbert space H, then the simplest thing T
can do to a vector x is to transform it into a scalar multiple of itself:

$$Tx = \lambda x. \qquad (1)$$

A non-zero vector x such that Eq. (1) is true for some scalar $\lambda$ is called
an eigenvector of T, and a scalar $\lambda$ such that (1) holds for some non-zero x
is called an eigenvalue of T.¹ Each eigenvalue has one or more eigenvectors
associated with it, and to each eigenvector there corresponds
precisely one eigenvalue. If H has no non-zero vectors at all, then T
certainly has no eigenvectors. In this case the whole theory collapses
into triviality, so we assume throughout the present chapter that
$H \neq \{0\}$.

Let $\lambda$ be an eigenvalue of T, and consider the set M of all its
corresponding eigenvectors together with the vector 0 (note that 0 is not an
eigenvector). M is thus the set of all vectors x which satisfy the equation

$$(T - \lambda I)x = 0,$$

and it is clearly a non-zero closed linear subspace of H. We call M the
eigenspace of T corresponding to $\lambda$. It is evident that M is invariant
under T and that the restriction of T to M is a very simple operator,
namely, scalar multiplication by $\lambda$.

¹ The equivalent terms characteristic vector and characteristic value, and proper
vector and proper value, are used by many writers.

In order to place the ideas of this chapter in their proper framework,
we lay down several rather sweeping hypotheses, whose validity we
examine later:

(a) T actually has eigenvalues, and there are finitely many of them,
say $\lambda_1, \lambda_2, \ldots, \lambda_m$ (which are understood to be distinct),
with corresponding eigenspaces $M_1, M_2, \ldots, M_m$;

(b) the $M_i$'s are pairwise orthogonal, that is, $i \neq j \Rightarrow M_i \perp M_j$;

(c) the $M_i$'s span H.

Putting aside for a moment the question of whether these statements are
true or not, we investigate their implications. By (b) and (c), every
vector x in H can be expressed uniquely in the form

$$x = x_1 + x_2 + \cdots + x_m, \qquad (2)$$

where $x_i$ is in $M_i$ for each i and the $x_i$'s are pairwise orthogonal. It now
follows from (a) that

$$Tx = Tx_1 + Tx_2 + \cdots + Tx_m = \lambda_1x_1 + \lambda_2x_2 + \cdots + \lambda_mx_m. \qquad (3)$$


This relation exhibits the action of T over all of H in a manner which
renders its structure perfectly clear from the geometric point of view.
It will be convenient to express this result in terms of the projections
$P_i$ on the eigenspaces $M_i$. By Theorem 59-F, (b) is equivalent to the
following statement:

    the $P_i$'s are pairwise orthogonal. (4)

Also, since for each i and for every $j \neq i$ we have $M_j \subseteq M_i^\perp$, Eq. (2)
yields

$$P_ix = x_i,$$

and it follows at once from this that

$$Ix = x = x_1 + x_2 + \cdots + x_m = P_1x + P_2x + \cdots + P_mx = (P_1 + P_2 + \cdots + P_m)x$$

for every x in H, so

$$I = P_1 + P_2 + \cdots + P_m. \qquad (5)$$


Relation (3) now tells us that

$$Tx = \lambda_1x_1 + \lambda_2x_2 + \cdots + \lambda_mx_m = \lambda_1P_1x + \lambda_2P_2x + \cdots + \lambda_mP_mx = (\lambda_1P_1 + \lambda_2P_2 + \cdots + \lambda_mP_m)x$$

for every x, so

$$T = \lambda_1P_1 + \lambda_2P_2 + \cdots + \lambda_mP_m. \qquad (6)$$

The expression for T given by (6), when it exists, is called the spectral
resolution of T. Whenever this term is used, it is to be understood that
the $\lambda_i$'s are distinct and that the $P_i$'s are non-zero projections which
satisfy conditions (4) and (5). We shall see later that the spectral
resolution of T is unique when it exists.

All our inferences from (a), (b), and (c) are perfectly rigorous, but
the status of these three hypotheses remains entirely up in the air. First
of all, with reference to (a), does an arbitrary operator T on H necessarily
have an eigenvalue? The answer to this is no, as the reader will easily
verify by considering the operator T on $l_2$ defined by

$$T\{x_1, x_2, \ldots\} = \{0, x_1, x_2, \ldots\}.$$

On the other hand, if H is finite-dimensional, then we shall see in Sec. 61
that every operator has an eigenvalue. For this reason, we assume for
the remainder of the chapter (unless we specifically state otherwise)
that H is finite-dimensional with dimension n.

We have seen that if T satisfies conditions (a), (b), and (c), then it
has the spectral resolution (6). It is too much to hope that every operator
on H meets these requirements, so the question arises as to what
restrictions they impose on T. This question is easy to answer: T must
be normal. For it follows from (6) that

$$T^* = \bar\lambda_1P_1 + \bar\lambda_2P_2 + \cdots + \bar\lambda_mP_m;$$

and by using (4) we readily obtain

$$TT^* = (\lambda_1P_1 + \lambda_2P_2 + \cdots + \lambda_mP_m)(\bar\lambda_1P_1 + \bar\lambda_2P_2 + \cdots + \bar\lambda_mP_m) = |\lambda_1|^2P_1 + |\lambda_2|^2P_2 + \cdots + |\lambda_m|^2P_m$$

and, similarly,

$$T^*T = |\lambda_1|^2P_1 + |\lambda_2|^2P_2 + \cdots + |\lambda_m|^2P_m.$$
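
The computation of $TT^*$ and $T^*T$ from a spectral resolution can be checked numerically. The sketch below is only an illustration: it builds a 2 x 2 operator from two orthogonal projections and two arbitrarily chosen eigenvalues, and compares both products with $\sum |\lambda_i|^2P_i$. The helper names (`matmul`, `adjoint`, `combine`) and the sample data are assumptions made for the example.

```python
def matmul(a, b):
    """Row-by-column product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def adjoint(a):
    """Conjugate transpose of a square matrix."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def combine(scalars, mats):
    """Linear combination sum_i scalars[i] * mats[i]."""
    n = len(mats[0])
    return [[sum(c * m[i][j] for c, m in zip(scalars, mats)) for j in range(n)]
            for i in range(n)]

# Orthogonal projections on the lines spanned by (1, 1) and (1, -1).
P1 = [[0.5, 0.5], [0.5, 0.5]]
P2 = [[0.5, -0.5], [-0.5, 0.5]]
lams = [2j, 4]  # two distinct eigenvalues, chosen arbitrarily

T = combine(lams, [P1, P2])          # T = 2i*P1 + 4*P2
T_star = adjoint(T)
expected = combine([abs(l) ** 2 for l in lams], [P1, P2])  # sum |lambda_i|^2 P_i
```

With these values every entry is exactly representable, so `matmul(T, T_star)`, `matmul(T_star, T)`, and `expected` can be compared with `==`.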


This entire circle of ideas will be completed in the neatest possible way 
if we can show that every normal operator on H satisfies conditions (a), 
(b), and (c), and therefore has a spectral resolution. Our aim in the 
present chapter is to prove this assertion, which is known as the spectral 
theorem, and the machinery treated in the following sections is directed 
exclusively toward this end. We emphasize once again that H is under- 
stood to be finite-dimensional with dimension n > 0. 


60. MATRICES 


Our first goal is to prove that every operator on H has an eigenvalue,
and in pursuing this we make use of certain elementary portions of the
theory of matrices. We adopt the view that the reader is probably
familiar with this theory to some degree and that it suffices here to give a
brief sketch of its basic ideas. Our discussion in this section is entirely
independent of the Hilbert space character of H and applies equally well
to any non-trivial finite-dimensional linear space.

Let $B = \{e_1, e_2, \ldots, e_n\}$ be an ordered basis for H, so that each
vector in H is uniquely expressible as a linear combination of the $e_i$'s.
If T is an operator on H, then for each $e_j$ we have

$$Te_j = \sum_{i=1}^n \alpha_{ij}e_i. \qquad (1)$$

The $n^2$ scalars $\alpha_{ij}$ which are determined in this way by T form the matrix
of T relative to the ordered basis B. We symbolize this matrix by [T],
or if it seems desirable to indicate the ordered basis under consideration,
by $[T]_B$. It is customary to write out a matrix as a square array:

$$[T] = \begin{bmatrix} \alpha_{11} & \alpha_{12} & \cdots & \alpha_{1n} \\ \alpha_{21} & \alpha_{22} & \cdots & \alpha_{2n} \\ \vdots & \vdots & & \vdots \\ \alpha_{n1} & \alpha_{n2} & \cdots & \alpha_{nn} \end{bmatrix}. \qquad (2)$$

The array of scalars $(\alpha_{i1}, \alpha_{i2}, \ldots, \alpha_{in})$ is the ith row of the matrix [T],
and $(\alpha_{1j}, \alpha_{2j}, \ldots, \alpha_{nj})$ is its jth column. As this terminology shows,
the first subscript on the entry $\alpha_{ij}$ always indicates the row to which it
belongs, and the second the column. In our work, we generally write
(2) more concisely in the form

$$[T] = [\alpha_{ij}]. \qquad (3)$$

The reader should make sure that he has a perfectly clear understanding
of the rule according to which the matrix of T is constructed: write $Te_j$
as a linear combination of $e_1, e_2, \ldots, e_n$, and use the resulting
coefficients to form the jth column of [T].
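
The column rule can be made concrete with a short sketch. It is an illustration under simplifying assumptions: the basis is taken to be the standard basis of n-tuples (so that the coefficients of $Te_j$ are simply its coordinates), and the operator `T` and the helper names `matrix_of` and `apply` are hypothetical.

```python
def matrix_of(operator, n):
    """Matrix of a linear operator on n-tuples relative to the standard basis.

    The j-th column holds the coefficients of operator(e_j).
    """
    columns = []
    for j in range(n):
        e_j = [1 if i == j else 0 for i in range(n)]
        columns.append(operator(e_j))
    # columns[j][i] is alpha_ij, so transpose to obtain the rows.
    return [[columns[j][i] for j in range(n)] for i in range(n)]

def apply(matrix, x):
    """Multiply a matrix by a coordinate vector."""
    return [sum(row[j] * x[j] for j in range(len(x))) for row in matrix]

# A hypothetical operator on pairs: T(x1, x2) = (x1 + 2*x2, 3*x2).
def T(x):
    return [x[0] + 2 * x[1], 3 * x[1]]

M = matrix_of(T, 2)
```

Since $Te_1 = (1, 0)$ and $Te_2 = (2, 3)$, these vectors become the first and second columns of `M`, and `apply(M, x)` agrees with `T(x)` for every x.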

We offer several comments on the above paragraph. First, the 
term matrix has not been defined at all, but only “the matrix of an 
operator relative to an ordered basis.” A matrix—defined simply as a 
square array of scalars—is sometimes regarded as an object worthy of 
interest in its own right. For the most part, however, we shall consider 
a matrix to be associated with a definite operator relative to a particular 
ordered basis, and we shall regard matrices as little more than computa- 
tional devices which are occasionally useful in handling operators. Next, 
the matrices we work with are all square matrices. Rectangular matrices 
occur in connection with linear transformations of one linear space into 
another and are of no interest to us here. Finally, we took B to be an
ordered basis rather than merely a basis, because the appearance of the
array (2) clearly depends on the arrangement of the $e_i$'s as well as on the
$e_i$'s themselves. In most theoretical considerations, however, the order
of the rows and columns of a matrix is as irrelevant as the order of the
vectors in a basis. For this reason, we usually omit the adjective and 
speak of ‘‘the matrix of an operator relative to a basis.” 

By using the fixed basis $B = \{e_i\}$, we have assigned a matrix
$[T] = [\alpha_{ij}]$ to each operator T on H, and the mapping $T \to [T]$ from
operators to matrices is described by $Te_j = \sum_{i=1}^n \alpha_{ij}e_i$. The importance
of matrices is based primarily on two facts: $T \to [T]$ is a one-to-one
mapping of the set of all operators on H onto the set of all matrices; and
algebraic operations can be defined on the set of all matrices in such a
manner that the mapping $T \to [T]$ preserves the algebraic structure
of $\mathcal{B}(H)$.

The first of these statements is easy to prove. If we know that
$[\alpha_{ij}]$ is the matrix of T, then this information fully determines Tx for
every x; for if $x = \sum_{j=1}^n \beta_je_j$, then

$$Tx = \sum_{j=1}^n \beta_jTe_j = \sum_{j=1}^n \beta_j\Big(\sum_{i=1}^n \alpha_{ij}e_i\Big) = \sum_{i=1}^n \Big(\sum_{j=1}^n \alpha_{ij}\beta_j\Big)e_i.$$

This shows that $T \to [T]$ is one-to-one. We see that this mapping is
onto by means of the following reasoning: if $[\alpha_{ij}]$ is any matrix, then
$Te_j = \sum_{i=1}^n \alpha_{ij}e_i$ defines T for the vectors in B, and when T is extended
by linearity to all of H, it is clear that the resulting operator has $[\alpha_{ij}]$ as
its matrix.

To establish the second statement, it suffices to discover how to add
and multiply two matrices and how to multiply a matrix by a scalar, in
such a way that the following matrix equations are true for all operators
$T_1$ and $T_2$ on H: $[T_1 + T_2] = [T_1] + [T_2]$, $[\alpha T_1] = \alpha[T_1]$, and

$$[T_1T_2] = [T_1][T_2].$$

Let $[\alpha_{ij}]$ and $[\beta_{ij}]$ be the matrices of $T_1$ and $T_2$. The computation

$$(T_1 + T_2)e_j = T_1e_j + T_2e_j = \sum_{i=1}^n \alpha_{ij}e_i + \sum_{i=1}^n \beta_{ij}e_i = \sum_{i=1}^n (\alpha_{ij} + \beta_{ij})e_i$$

shows that if we define addition for matrices by

$$[\alpha_{ij}] + [\beta_{ij}] = [\alpha_{ij} + \beta_{ij}], \qquad (4)$$

then we obtain $[T_1 + T_2] = [T_1] + [T_2]$.
Similarly, if we multiply a matrix by a scalar in accordance with

$$\alpha[\alpha_{ij}] = [\alpha\alpha_{ij}], \qquad (5)$$

then $[\alpha T_1] = \alpha[T_1]$.
Finally, the computation

$$(T_1T_2)e_j = T_1(T_2e_j) = T_1\Big(\sum_{k=1}^n \beta_{kj}e_k\Big) = \sum_{k=1}^n \beta_{kj}T_1e_k = \sum_{k=1}^n \beta_{kj}\Big(\sum_{i=1}^n \alpha_{ik}e_i\Big) = \sum_{i=1}^n \Big(\sum_{k=1}^n \alpha_{ik}\beta_{kj}\Big)e_i$$

shows that if we define multiplication for matrices by

$$[\alpha_{ij}][\beta_{ij}] = \Big[\sum_{k=1}^n \alpha_{ik}\beta_{kj}\Big], \qquad (6)$$

then we get $[T_1T_2] = [T_1][T_2]$.


The operations defined by (4), (5), and (6) are the standard algebraic
operations for matrices. In words, we add two matrices by adding
corresponding entries, and we multiply a matrix by a scalar by multiplying
each of its entries by that scalar. The verbal description of (6) is
more complicated, and is often called the row-by-column rule: to find the
entry in the ith row and jth column of the product $[\alpha_{ij}][\beta_{ij}]$, take the ith
row $(\alpha_{i1}, \alpha_{i2}, \ldots, \alpha_{in})$ of the first factor and the jth column
$(\beta_{1j}, \beta_{2j}, \ldots, \beta_{nj})$ of the second, multiply corresponding entries, and add:

$$\sum_{k=1}^n \alpha_{ik}\beta_{kj} = \alpha_{i1}\beta_{1j} + \alpha_{i2}\beta_{2j} + \cdots + \alpha_{in}\beta_{nj}.$$

It is worth noting that the image of the zero operator under the mapping
$T \to [T]$ is the zero matrix, all of whose entries are 0. Further, it is
equally clear that the image of the identity operator is the identity matrix,
which has 1's down the main diagonal (where i = j) and 0's elsewhere.
If we introduce the standard Kronecker delta, which is defined by

$$\delta_{ij} = \begin{cases} 0 & \text{if } i \neq j \\ 1 & \text{if } i = j, \end{cases}$$

then the identity matrix can be written $[\delta_{ij}]$.

We now reverse our point of view for a moment (but only a moment)
and consider the set $A_n$ of all n × n matrices as an algebraic system in its
own right, with addition, scalar multiplication, and multiplication defined
by (4), (5), and (6). It can be verified directly from these definitions
that $A_n$ is a complex algebra with identity (the identity matrix), called
the total matrix algebra of degree n. If we ignore the ideas leading to
(4), (5), and (6), then the structure of $A_n$ is defined, and can be studied,
without any reference to its origin as a representing system for the operators
on H. This approach would make very little sense, however, because
the primary reason for considering matrices in the first place is that they
provide a computational tool which is useful in treating certain aspects
of the theory of these operators.

Let us return to our original position and observe two facts: that
$\mathcal{B}(H)$ is an algebra; and that the structure of $A_n$ is defined in just such a
way as to guarantee that the one-to-one mapping $T \to [T]$ of $\mathcal{B}(H)$ onto
$A_n$ preserves addition, scalar multiplication, and multiplication. It now
follows at once that $A_n$ is an algebra, and that $T \to [T]$ is an isomorphism
(see Problem 45-4) of $\mathcal{B}(H)$ onto $A_n$.
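
The row-by-column rule and the fact that $T \to [T]$ carries composition of operators to multiplication of matrices can be illustrated by a small sketch. It is offered only as a check on sample data; the helper names and the matrices `A` and `B` are assumptions made for the example, with operators identified with their matrices relative to the standard basis.

```python
def kronecker_delta(i, j):
    """1 if i == j, and 0 otherwise."""
    return 1 if i == j else 0

def identity(n):
    """The n x n identity matrix [delta_ij]."""
    return [[kronecker_delta(i, j) for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Entry (i, j) is the i-th row of a times the j-th column of b."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def apply(matrix, x):
    """Multiply a matrix by a coordinate vector."""
    return [sum(row[j] * x[j] for j in range(len(x))) for row in matrix]

A = [[1, 2], [0, 3]]
B = [[0, 1], [4, 5]]
AB = matmul(A, B)
x = [7, -2]
```

Applying `B` and then `A` to `x` gives the same vector as applying `AB` once, while `matmul(B, A)` differs from `AB`, a reminder that the algebra $A_n$ is not commutative for n > 1.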

We give the following formal summary of our work so far. 


Theorem A. If $B = \{e_i\}$ is a basis for H, then the mapping $T \to [T]$,
which assigns to each operator T its matrix relative to B, is an isomorphism
of the algebra $\mathcal{B}(H)$ onto the total matrix algebra $A_n$.


If T is a non-singular operator whose matrix relative to B is $[\alpha_{ij}]$,
then $T^{-1}$ clearly has a matrix whose entries are determined in some way
by the $\alpha_{ij}$'s. The formulas involved here are rather clumsy and complicated,
and since they have no importance for us, we shall say nothing
further about them.

It is necessary, however, to know what is meant by the inverse of
a matrix, when it is considered purely as an element of $A_n$ and without
reference to any operator which it may represent. We first remark that
the identity matrix $[\delta_{ij}]$ is easily seen by direct matrix multiplication to
be an identity element for the algebra $A_n$, in the sense that we have

$$[\alpha_{ij}][\delta_{ij}] = [\delta_{ij}][\alpha_{ij}] = [\alpha_{ij}]$$

for every matrix $[\alpha_{ij}]$; and by the theory of rings, this identity is unique.
A matrix $[\alpha_{ij}]$ is said to be non-singular if there exists a matrix $[\beta_{ij}]$ such
that

$$[\alpha_{ij}][\beta_{ij}] = [\beta_{ij}][\alpha_{ij}] = [\delta_{ij}];$$

and, again by the theory of rings, if such a matrix exists, then it is unique,
it is denoted by $[\alpha_{ij}]^{-1}$, and it is called the inverse of $[\alpha_{ij}]$.


These ideas are connected with operators by the following considerations.
Suppose that $[\alpha_{ij}]$ is the matrix of an operator T relative to B.
We know that the non-singularity of T is equivalent to the existence of an
operator $T^{-1}$ such that

$$TT^{-1} = T^{-1}T = I.$$

The isomorphism of Theorem A transforms this operator equation into
the matrix equation

$$[T][T^{-1}] = [T^{-1}][T] = [I],$$

which is equivalent to

$$[\alpha_{ij}][T^{-1}] = [T^{-1}][\alpha_{ij}] = [\delta_{ij}].$$

We therefore have

Theorem B. Let B be a basis for H, and T an operator whose matrix
relative to B is $[\alpha_{ij}]$. Then T is non-singular $\Leftrightarrow$ $[\alpha_{ij}]$ is non-singular, and in
this case $[\alpha_{ij}]^{-1} = [T^{-1}]$.

There is one further issue which requires discussion. If T is a fixed
operator on H, then its matrix $[T]_B$ relative to B obviously depends on
the choice of B. If B changes, how does $[T]_B$ change? More specifically,
if $B' = \{f_1, f_2, \ldots, f_n\}$ is also a basis for H, what is the relation
between $[T]_B$ and $[T]_{B'}$? The answer to this question is best given in
terms of the non-singular operator A defined by $Ae_i = f_i$. Let $[\alpha_{ij}]$
and $[\beta_{ij}]$ be the matrices of T relative to B and B', so that

$$Te_j = \sum_{i=1}^n \alpha_{ij}e_i$$

and $Tf_j = \sum_{i=1}^n \beta_{ij}f_i$. Let $[\gamma_{ij}]$ be the matrix of A relative to B, so that
$Ae_j = \sum_{i=1}^n \gamma_{ij}e_i$. By Theorem B, $[\gamma_{ij}]$ is non-singular. We now compute
$Tf_j$ in two different ways:

$$Tf_j = \sum_{k=1}^n \beta_{kj}f_k = \sum_{k=1}^n \beta_{kj}Ae_k = \sum_{k=1}^n \beta_{kj}\Big(\sum_{i=1}^n \gamma_{ik}e_i\Big) = \sum_{i=1}^n \Big(\sum_{k=1}^n \gamma_{ik}\beta_{kj}\Big)e_i$$

and

$$Tf_j = TAe_j = T\Big(\sum_{k=1}^n \gamma_{kj}e_k\Big) = \sum_{k=1}^n \gamma_{kj}Te_k = \sum_{k=1}^n \gamma_{kj}\Big(\sum_{i=1}^n \alpha_{ik}e_i\Big) = \sum_{i=1}^n \Big(\sum_{k=1}^n \alpha_{ik}\gamma_{kj}\Big)e_i.$$

A comparison of these results shows that

$$\sum_{k=1}^n \gamma_{ik}\beta_{kj} = \sum_{k=1}^n \alpha_{ik}\gamma_{kj}$$

for all i and j, so

$$[\gamma_{ij}][\beta_{ij}] = [\alpha_{ij}][\gamma_{ij}]$$

or

$$[\beta_{ij}] = [\gamma_{ij}]^{-1}[\alpha_{ij}][\gamma_{ij}]. \qquad (7)$$

If we now write this in the form

$$[T]_{B'} = [A]_B^{-1}[T]_B[A]_B,$$

then it becomes quite clear how the matrix of T changes when B is
replaced by B'.

Two matrices $[\alpha_{ij}]$ and $[\beta_{ij}]$ are said to be similar if there exists a
non-singular matrix $[\gamma_{ij}]$ such that (7) is true. The analysis given above
proves half of the following theorem (we leave the proof of the other half
to the reader).
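
Relation (7) can be tried out on a small example. The sketch below is an illustration only: it takes B to be the standard basis of pairs, picks a hypothetical matrix `alpha` for T and a hypothetical second basis B', and computes $[\gamma_{ij}]^{-1}[\alpha_{ij}][\gamma_{ij}]$ in exact arithmetic. The helper names `matmul` and `inverse_2x2` are assumptions made for the example.

```python
from fractions import Fraction

def matmul(a, b):
    """Row-by-column product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def inverse_2x2(m):
    """Inverse of a non-singular 2 x 2 matrix, in exact arithmetic."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

# [T]_B for a hypothetical operator T, with B the standard basis.
alpha = [[2, 1], [0, 3]]
# New basis B' = {f1, f2} with f1 = (1, 0) and f2 = (1, 1).
# The j-th column of gamma holds the coordinates of f_j = A e_j relative to B.
gamma = [[1, 1], [0, 1]]

beta = matmul(inverse_2x2(gamma), matmul(alpha, gamma))  # [T]_B'
```

In this example $Tf_1 = 2f_1$ and $Tf_2 = 3f_2$, so the matrix of T relative to B' should be diagonal with entries 2 and 3, and that is what `beta` contains.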


Theorem C. Two matrices in $A_n$ are similar $\Leftrightarrow$ they are the matrices of a
single operator on H relative to (possibly) different bases.


We are now in a position to formulate the fundamental problem of 
the classical theory of matrices. A given operator on H may have many 
different matrices relative to different bases, and Theorem C shows in 
purely matrix terms how these matrices are related to one another. The 
question arises as to whether it is possible to find, for each operator (or for 
each operator of a special kind), a basis relative to which its matrix 
assumes some particularly simple form. This is the canonical form 
problem of matrix theory, and the most important theorem in this direc- 
tion is the spectral theorem, which we state in the language of matrices 
in Sec. 62. In the classical approach to these ideas, it was customary to 
work exclusively with matrices. However, the great advances in the 
understanding of algebra which have taken place in recent years have 
made it plain that problems of this kind are best treated intrinsically, 
that is, directly in terms of the linear spaces and linear transformations 
involved. As matters now stand, it is possible—and preferable—to state 
the main canonical form theorems of matrix theory without mentioning 
matrices at all. Nevertheless, matrices remain useful for some purposes, 
notably (from our point of view) in the problem of proving that an 
arbitrary operator on H has an eigenvalue. 


Problems 


1. Show that the dimension of $\mathcal{B}(H)$ is $n^2$.

2. A scalar matrix in $A_n$ is one which has the same scalar in every position
   on the main diagonal and 0's elsewhere. Show that a scalar
   matrix commutes with every matrix, and that a matrix which
   commutes with every matrix is necessarily scalar. What does this
   imply about $\mathcal{B}(H)$? (See Problem 45-3.)

3. A diagonal matrix in $A_n$ is one which has arbitrary scalars on the main
   diagonal and 0's elsewhere. Show that all diagonal matrices commute
   with one another, and that a matrix is necessarily diagonal if it
   commutes with all diagonal matrices.

4. Complete the proof of Theorem C.

5. Let $\theta$ be a fixed real number, and show that the following two matrices
   in $A_2$ are similar:

   $$\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} e^{i\theta} & 0 \\ 0 & e^{-i\theta} \end{bmatrix}.$$

   (Hint: let T be the operator on $l_2^2$ whose matrix relative to the basis
   $B = \{e_1, e_2\}$, where $e_1 = (1,0)$ and $e_2 = (0,1)$, is the first of those
   given, and find another basis $B' = \{f_1, f_2\}$ such that $Tf_1 = e^{i\theta}f_1$ and
   $Tf_2 = e^{-i\theta}f_2$.)

6. Let $T_1$ and $T_2$ be operators on H, and show that there exist bases
   B and B' such that $[T_1]_B = [T_2]_{B'}$ $\Leftrightarrow$ there exists a non-singular
   operator A such that $T_2 = AT_1A^{-1}$. (Hint: if $[T_1]_B = [T_2]_{B'}$, let
   A be the operator which carries B onto B'; and if $T_2 = AT_1A^{-1}$, let
   B be any basis and B' its image under A.)


61. DETERMINANTS AND THE SPECTRUM OF AN OPERATOR 


Determinants are often advertised to students of elementary 
mathematics as a computational device of great value and efficiency for 
solving numerical problems involving systems of linear equations. This 
is somewhat misleading, for their value in problems of this kind is very 
limited. On the other hand, they do have definite importance as a 
theoretical tool. Briefly, they provide a numerical means of distinguish- 
ing between singular and non-singular matrices (and operators). 

This is not the place for developing the theory of determinants in any 
detail. Instead, we assume that the reader already knows something 
about them, and we confine ourselves to listing a few of their simpler 
properties which are relevant to our present interests. 

Let $[\alpha_{ij}]$ be an n × n matrix. The determinant of this matrix, which
we denote by $\det([\alpha_{ij}])$, is a scalar associated with it in such a way that

(1) $\det([\delta_{ij}]) = 1$;

(2) $\det([\alpha_{ij}][\beta_{ij}]) = \det([\alpha_{ij}])\det([\beta_{ij}])$;

(3) $\det([\alpha_{ij}]) \neq 0 \Leftrightarrow [\alpha_{ij}]$ is non-singular; and

(4) $\det([\alpha_{ij}] - \lambda[\delta_{ij}])$ is a polynomial, with complex coefficients, of
degree n in the variable $\lambda$.

The determinant function det is thus a scalar-valued function of matrices
which has certain properties. In elementary work, the determinant of a
matrix is usually written out with vertical bars, as follows,

$$\det([\alpha_{ij}]) = \begin{vmatrix} \alpha_{11} & \alpha_{12} & \cdots & \alpha_{1n} \\ \alpha_{21} & \alpha_{22} & \cdots & \alpha_{2n} \\ \vdots & \vdots & & \vdots \\ \alpha_{n1} & \alpha_{n2} & \cdots & \alpha_{nn} \end{vmatrix},$$

and is evaluated by complicated procedures which are of no concern to
us here.

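We now consider an operator T on H. If B and B' are bases for H,
then the matrices $[\alpha_{ij}]$ and $[\beta_{ij}]$ of T relative to B and B' may be entirely
different, but nevertheless they have the same determinant. For we
know from the previous section that there exists a non-singular matrix
$[\gamma_{ij}]$ such that

$$[\beta_{ij}] = [\gamma_{ij}]^{-1}[\alpha_{ij}][\gamma_{ij}];$$

and therefore, by properties (1), (2), and (3), we have

$$\begin{aligned}
\det([\beta_{ij}]) &= \det([\gamma_{ij}]^{-1}[\alpha_{ij}][\gamma_{ij}]) \\
&= \det([\gamma_{ij}]^{-1})\det([\alpha_{ij}])\det([\gamma_{ij}]) \\
&= \det([\gamma_{ij}]^{-1})\det([\gamma_{ij}])\det([\alpha_{ij}]) \\
&= \det([\gamma_{ij}]^{-1}[\gamma_{ij}])\det([\alpha_{ij}]) \\
&= \det([\delta_{ij}])\det([\alpha_{ij}]) \\
&= \det([\alpha_{ij}]).
\end{aligned}$$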

This result allows us to speak of the determinant of the operator T, meaning,
of course, the determinant of its matrix relative to any basis; and from
this point on, we shall regard the determinant function primarily as a
scalar-valued function of the operators on H. We at once obtain the
following four properties for this function, which are simply translations
of those stated above:

(1') $\det(I) = 1$;

(2') $\det(T_1T_2) = \det(T_1)\det(T_2)$;

(3') $\det(T) \neq 0 \Leftrightarrow T$ is non-singular; and

(4') $\det(T - \lambda I)$ is a polynomial, with complex coefficients, of
degree n in the variable $\lambda$.

We are now in a position to take up once again, and to settle, the problem
of the existence of eigenvalues.

Let T be an operator on H. If we recall Problem 44-6, it is clear
that a scalar $\lambda$ is an eigenvalue of T $\Leftrightarrow$ there exists a non-zero vector x
such that $(T - \lambda I)x = 0$ $\Leftrightarrow$ $T - \lambda I$ is singular $\Leftrightarrow$ $\det(T - \lambda I) = 0$.
The eigenvalues of T are therefore precisely the distinct roots of the
equation

$$\det(T - \lambda I) = 0, \qquad (1)$$

which is called the characteristic equation of T. It may illuminate matters
somewhat if we choose a basis B for H, find the matrix $[\alpha_{ij}]$ of T relative
to B, and write the characteristic equation in the extended form

$$\begin{vmatrix} \alpha_{11} - \lambda & \alpha_{12} & \cdots & \alpha_{1n} \\ \alpha_{21} & \alpha_{22} - \lambda & \cdots & \alpha_{2n} \\ \vdots & \vdots & & \vdots \\ \alpha_{n1} & \alpha_{n2} & \cdots & \alpha_{nn} - \lambda \end{vmatrix} = 0.$$

Our search for eigenvalues of T is reduced in this way to a search for roots
of Eq. (1). Property (4') tells us that this is a polynomial equation,
with complex coefficients, of degree n in the complex variable $\lambda$. We now
appeal to the fundamental theorem of algebra, which guarantees that an
equation of this kind always has exactly n complex roots. Some of these
roots may of course be repeated, in which case there are fewer than n
distinct roots. In summary, we have


Theorem A. If T is an arbitrary operator on H, then the eigenvalues of T 
constitute a non-empty finite subset of the complex plane. Furthermore, the 
number of points in this set does not exceed the dimension n of the space H. 


The set of eigenvalues of T is called its spectrum, and is denoted by
$\sigma(T)$. For future reference, we observe that $\sigma(T)$ is a compact subspace
of the complex plane.

It should now be reasonably clear why we required in the definition 
of a Hilbert space that its scalars be the complex numbers. The reader 
will easily convince himself that in the Euclidean plane the operation of 
rotation about the origin through 90 degrees is an operator on this real 
Banach space which has no eigenvalues at all, for no non-zero vector is 
transformed into a real multiple of itself. The existence of eigenvalues is 
therefore linked in an essential way to properties of the complex numbers 
which are not enjoyed by the real numbers, and the most significant 
of these properties is that stated in the fundamental theorem of algebra. 
The mechanism of matrices and determinants turns out to be simply a 
device for making effective use of this theorem in our basic problem of 
proving that eigenvalues exist. We also remark that Theorem A and its 
proof remain valid in the case of an arbitrary linear transformation on 
any complex linear space of finite dimension n > 0. 
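
For n = 2 the characteristic equation is a quadratic, and its roots can be written down directly. The following sketch is an illustration only; it uses the usual 2 x 2 determinant $\alpha_{11}\alpha_{22} - \alpha_{12}\alpha_{21}$ and the quadratic formula, and the function name `eigenvalues_2x2` and the sample matrices are assumptions made for the example.

```python
import cmath

def eigenvalues_2x2(m):
    """Roots of det(m - lambda*I) = 0 for a 2 x 2 matrix m.

    With det = a11*a22 - a12*a21, the characteristic equation is
    lambda^2 - (a11 + a22)*lambda + (a11*a22 - a12*a21) = 0.
    """
    (a11, a12), (a21, a22) = m
    trace = a11 + a22
    det = a11 * a22 - a12 * a21
    root = cmath.sqrt(trace * trace - 4 * det)  # complex square root
    return (trace + root) / 2, (trace - root) / 2

# Rotation of the plane through 90 degrees about the origin.
rotation = [[0, -1], [1, 0]]
# An upper triangular example whose characteristic equation has a repeated root.
shear = [[1, 1], [0, 1]]

rot_eigs = eigenvalues_2x2(rotation)
shear_eigs = eigenvalues_2x2(shear)
```

The rotation has characteristic equation $\lambda^2 + 1 = 0$, whose roots are $\pm i$: over the complex numbers it has two eigenvalues, and over the real numbers none. The second matrix shows a repeated root, so that the spectrum has fewer than n points.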


Problems 


1. Let T be an operator on H, and prove the following statements:
   (a) T is singular $\Leftrightarrow$ $0 \in \sigma(T)$;
   (b) if T is non-singular, then $\lambda \in \sigma(T) \Leftrightarrow \lambda^{-1} \in \sigma(T^{-1})$;
   (c) if A is non-singular, then $\sigma(ATA^{-1}) = \sigma(T)$;
   (d) if $\lambda \in \sigma(T)$, and if p is any polynomial, then $p(\lambda) \in \sigma(p(T))$;
   (e) if $T^k = 0$ for some positive integer k, then $\sigma(T) = \{0\}$.

2. Let the dimension n of H be 2, let $B = \{e_1, e_2\}$ be a basis for H, and
   assume that the determinant of a 2 × 2 matrix $[\alpha_{ij}]$ is given by
   $\alpha_{11}\alpha_{22} - \alpha_{12}\alpha_{21}$.

   (a) Find the spectrum of the operator T on H defined by $Te_1 = e_2$
   and $Te_2 = -e_1$.

   (b) If T is an arbitrary operator on H whose matrix relative to B is
   $[\alpha_{ij}]$, show that $T^2 - (\alpha_{11} + \alpha_{22})T + (\alpha_{11}\alpha_{22} - \alpha_{12}\alpha_{21})I = 0$.
   Give a verbal statement of this result.


62. THE SPECTRAL THEOREM


We now return to the central purpose of this chapter, namely, the 
statement and proof of the spectral theorem. 

Let T be an arbitrary operator on H. We know by Theorem 61-A
that the distinct eigenvalues of T form a non-empty finite set of complex
numbers. Let $\lambda_1, \lambda_2, \ldots, \lambda_m$ be these eigenvalues; let $M_1, M_2, \ldots,
M_m$ be their corresponding eigenspaces; and let $P_1, P_2, \ldots, P_m$ be the
projections on these eigenspaces. We consider the following three
statements.

I. The $M_i$'s are pairwise orthogonal and span H.

II. The $P_i$'s are pairwise orthogonal, $I = \sum_{i=1}^m P_i$, and $T = \sum_{i=1}^m \lambda_iP_i$.

III. T is normal.

We take the spectral theorem to be the assertion that these statements are
all equivalent to one another. It was proved in the introduction to this
chapter that I $\Rightarrow$ II $\Rightarrow$ III. We now complete the cycle by showing that
III $\Rightarrow$ I.

The hypothesis that T is normal plays its most critical role in our 
first theorem. 


Theorem A. If T is normal, then x is an eigenvector of T with eigenvalue
$\lambda$ $\Leftrightarrow$ x is an eigenvector of $T^*$ with eigenvalue $\bar\lambda$.

PROOF. Since T is normal, it is easy to see that the operator $T - \lambda I$
(whose adjoint is $T^* - \bar\lambda I$) is also normal for any scalar $\lambda$. By Theorem
58-C, we have

$$\|Tx - \lambda x\| = \|T^*x - \bar\lambda x\|$$

for every vector x, and the statements of the theorem follow at once from
this.


The way is now clear for 


Theorem B. If T is normal, then the $M_i$'s are pairwise orthogonal.

PROOF. Let $x_i$ and $x_j$ be vectors in $M_i$ and $M_j$ for $i \neq j$, so that
$Tx_i = \lambda_ix_i$ and $Tx_j = \lambda_jx_j$. The preceding theorem shows that

$$\lambda_i(x_i, x_j) = (\lambda_ix_i, x_j) = (Tx_i, x_j) = (x_i, T^*x_j) = (x_i, \bar\lambda_jx_j) = \lambda_j(x_i, x_j);$$

and since $\lambda_i \neq \lambda_j$, it is clear that we must have $(x_i, x_j) = 0$.


Our next step is to prove that the M,’s span H when T is normal, and 
for this we need the following preliminary fact. 


Theorem C. If T is normal, then each $M_i$ reduces T.

PROOF. It is obvious that each $M_i$ is invariant under T, so it suffices,
by Theorem 59-C, to show that each $M_i$ is also invariant under $T^*$.
This is an immediate consequence of Theorem A, for if $x_i$ is a vector in
$M_i$, so that $Tx_i = \lambda_ix_i$, then $T^*x_i = \bar\lambda_ix_i$ is also in $M_i$.


Finally, we have 


Theorem D. If T is normal, then the $M_i$'s span H.

PROOF. The fact that the $M_i$'s are pairwise orthogonal implies, by
Theorems 59-F and 59-G, that $M = M_1 + M_2 + \cdots + M_m$ is a
closed linear subspace of H, and that its associated projection is

$$P = P_1 + P_2 + \cdots + P_m.$$

Since each $M_i$ reduces T, we see by Theorem 59-E that $TP_i = P_iT$ for
each $P_i$. It follows from this that $TP = PT$, so M also reduces T, and
consequently $M^\perp$ is invariant under T. If $M^\perp \neq \{0\}$, then, since all the
eigenvectors of T are contained in M, the restriction of T to $M^\perp$ is an
operator on a non-trivial finite-dimensional Hilbert space which has no
eigenvectors, and hence no eigenvalues. Theorem 61-A shows that this
is impossible. We therefore conclude that $M^\perp = \{0\}$, so M = H and
the $M_i$'s span H.


This completes the proof of the spectral theorem and, in particular, 
of the fact that if 7 is normal, then it has a spectral resolution 
T = riPi + dePe + > ++ + AnPm. (1) 


We now make several observations which will be useful in carrying out 
our promise to show that this expression for T is unique. Since the P,’s 
are pairwise orthogonal, if we square both sides of (1) we obtain 


™m™ 
T? = > r2Py. 
t=] 
More generally, if m is any positive integer, then 
T = > r»9P;. (2) 


t=] 
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If we make the customary agreement that 7° = J, then the fact that 
I = 22, P; shows that (2) is also valid for the case n = 0. Next, let 
p(z) be any polynomial, with complex coefficients, in the complex variable 
z. By taking linear combinations, (2) can evidently be extended to 


p(T) = 2 pr)Pe. (3) 
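We would like to find a polynomial p such that the right side of (3)
collapses to a specified one of the $P_i$'s, say $P_j$. What is needed is a
polynomial $p_j$ with the property that $p_j(\lambda_i) = 0$ if $i \neq j$ and $p_j(\lambda_j) = 1$.
We define $p_j$ as follows:

$$p_j(z) = \frac{(z - \lambda_1) \cdots (z - \lambda_{j-1})(z - \lambda_{j+1}) \cdots (z - \lambda_m)}{(\lambda_j - \lambda_1) \cdots (\lambda_j - \lambda_{j-1})(\lambda_j - \lambda_{j+1}) \cdots (\lambda_j - \lambda_m)}.$$

Since $p_j$ is a polynomial, and since $p_j(\lambda_i) = \delta_{ij}$, (3) yields

$$P_j = p_j(T). \tag{4}$$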


In order to interpret these remarks to our advantage, we point out that
only three facts about (1) have been used in obtaining (4): the $\lambda_i$'s are
distinct complex numbers; the $P_i$'s are pairwise orthogonal projections;
and $I = \sum_{i=1}^{m} P_i$. By using these properties of (1), and these alone, we
have shown that the $P_i$'s are uniquely determined as specific polynomials
in T.
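
The recovery of each projection as a polynomial in T can be checked numerically. The following is a minimal sketch and not part of the original argument: it assumes a hypothetical 2 × 2 self-adjoint matrix T with eigenvalues 1 and 3, chosen only for illustration, stores matrices as plain Python lists, and uses helper names (`mat_mul`, `spectral_projection`, and so on) of our own invention.

```python
# Sketch: spectral projections as Lagrange polynomials in T.
# The matrix T and all helper names are illustrative assumptions.

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def mat_add(a, b):
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * v for v in row] for row in a]

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def spectral_projection(t, eigenvalues, j):
    """Return p_j(T), where p_j(lambda_i) = delta_ij on the distinct eigenvalues."""
    n = len(t)
    result = identity(n)
    for i, lam in enumerate(eigenvalues):
        if i == j:
            continue
        # One factor (T - lambda_i I) / (lambda_j - lambda_i) of p_j(T).
        shifted = mat_add(t, scale(-lam, identity(n)))
        result = mat_mul(result, scale(1 / (eigenvalues[j] - lam), shifted))
    return result

T = [[2.0, 1.0], [1.0, 2.0]]   # self-adjoint, hence normal
eigs = [1.0, 3.0]              # its distinct eigenvalues
P = [spectral_projection(T, eigs, j) for j in range(len(eigs))]

# The three facts used in the text: orthogonality, I = sum of the P_i, and (1).
product = mat_mul(P[0], P[1])
total = mat_add(P[0], P[1])
rebuilt = mat_add(scale(eigs[0], P[0]), scale(eigs[1], P[1]))
```

For this particular T, `product` is the zero matrix, `total` is the identity, and `rebuilt` reproduces T, in agreement with the three properties of (1) listed above.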

We now assume that we have another expression for T similar to (1),

$$T = \alpha_1 Q_1 + \alpha_2 Q_2 + \cdots + \alpha_k Q_k, \tag{5}$$

and that this is also a spectral resolution of T, in the sense that the $\alpha_i$'s
are distinct complex numbers, the $Q_i$'s are non-zero pairwise orthogonal
projections, and $I = \sum_{i=1}^{k} Q_i$. We wish to show that (5) is actually
identical with (1), except for notation and order of terms. We begin by
proving, in two steps, that the $\alpha_i$'s are precisely the eigenvalues of T.
First, since $Q_i \neq 0$, there exists a non-zero vector x in the range of $Q_i$; and
since $Q_i x = x$ and $Q_j x = 0$ for $j \neq i$, we see from (5) that $Tx = \alpha_i x$, so
each $\alpha_i$ is an eigenvalue of T. Next, if $\lambda$ is an eigenvalue of T, so that
$Tx = \lambda x$ for some non-zero x, then

$$Tx = \lambda x = \lambda I x = \lambda \sum_{i=1}^{k} Q_i x = \sum_{i=1}^{k} \lambda Q_i x$$

and

$$Tx = \sum_{i=1}^{k} \alpha_i Q_i x,$$

so

$$\sum_{i=1}^{k} (\lambda - \alpha_i) Q_i x = 0.$$
Since the $Q_i x$'s are pairwise orthogonal, the non-zero vectors among
them (there is at least one, for $x \neq 0$) are linearly independent, and
this implies that $\lambda = \alpha_i$ for some i. These arguments show that the set
of $\alpha_i$'s equals the set of $\lambda_i$'s, and therefore, by changing notation if
necessary, we can write (5) in the form

$$T = \lambda_1 Q_1 + \lambda_2 Q_2 + \cdots + \lambda_m Q_m. \tag{6}$$

The discussion in the preceding paragraph now applies to (6) and gives

$$Q_j = p_j(T) \tag{7}$$

for every j. On comparing (7) with (4), we see that the $Q_j$'s equal the
$P_j$'s. This shows that (5) is exactly the same as (1), except for notation
and the order of terms, and completes our proof of the fact that the
spectral resolution of T is unique.

We conclude with a brief look at the matrix interpretation of
statements I and II at the beginning of this section. Assume that I is true,
that is, that the eigenspaces $M_1, M_2, \ldots, M_m$ of T are pairwise
orthogonal and span H. For each $M_i$, choose a basis which consists of mutually
orthogonal unit vectors. This can always be done, for a basis of this
kind (called an orthonormal basis) is precisely a complete orthonormal
set for $M_i$. It is easy to see that the union of these little bases is an
orthonormal basis for all of H; and relative to this, the matrix of T has
the following diagonal form (all entries off the main diagonal are
understood to be 0):

$$\begin{pmatrix}
\lambda_1 &        &           &        &           &        &           \\
          & \ddots &           &        &           &        &           \\
          &        & \lambda_1 &        &           &        &           \\
          &        &           & \ddots &           &        &           \\
          &        &           &        & \lambda_m &        &           \\
          &        &           &        &           & \ddots &           \\
          &        &           &        &           &        & \lambda_m
\end{pmatrix} \tag{8}$$
We next assume that H has an orthonormal basis relative to which the
matrix of T is diagonal. If we rearrange the basis vectors in such a way
that equal matrix entries adjoin one another on the main diagonal, then
the matrix of T relative to this new orthonormal basis will have the form
(8). It is easy to see from this that T can be written in the form

$$T = \sum_{i=1}^{m} \lambda_i P_i,$$

where the $\lambda_i$'s are distinct complex numbers, the $P_i$'s are non-zero pairwise
orthogonal projections, and $I = \sum_{i=1}^{m} P_i$. The uniqueness of the spectral
resolution now guarantees that the $\lambda_i$'s are the distinct eigenvalues of T
and that the $P_i$'s are the projections on the corresponding eigenspaces.
The spectral theorem tells us that statements I, II, and III are equivalent
to one another. The above remarks carry us a bit further, for they
constitute a proof of the fact that these statements are also equivalent to
IV. There exists an orthonormal basis for H relative to which the
matrix of T is diagonal.
It is interesting to realize that the implication III $\Rightarrow$ IV, which we
proved by showing that III $\Rightarrow$ I and I $\Rightarrow$ IV, can be made to depend
more directly on matrix computations. This proof is outlined in the
last three problems below.


Problems 


1. Show that an operator T on H is normal $\Leftrightarrow$ its adjoint $T^*$ is a
polynomial in T.

2. Let T be an arbitrary operator on H, and N a normal operator.
Show that if T commutes with N, then T also commutes with $N^*$.

3. Let T be a normal operator on H with spectrum $\{\lambda_1, \lambda_2, \ldots, \lambda_m\}$,
and use the spectral resolution of T to prove the following statements:
(a) T is self-adjoint $\Leftrightarrow$ each $\lambda_i$ is real; (b) T is positive $\Leftrightarrow$ $\lambda_i \ge 0$ for
each i; (c) T is unitary $\Leftrightarrow$ $|\lambda_i| = 1$ for each i.

4. Show that a positive operator T on H has a unique positive square
root; that is, show that there exists a unique positive operator A on
H such that $A^2 = T$.

5. Let $B = \{e_1, e_2, \ldots, e_n\}$ be an orthonormal basis for H. If T is
an operator on H whose matrix relative to B is $[\alpha_{ij}]$, show that the
matrix of $T^*$ relative to B is $[\beta_{ij}]$, where $\beta_{ij} = \overline{\alpha_{ji}}$. $[\beta_{ij}]$ is often called
the conjugate transpose of $[\alpha_{ij}]$.

6. Let T be an arbitrary operator on H, and prove that there exist n
closed linear subspaces $M_1, M_2, \ldots, M_n$ such that

$$\{0\} \subset M_1 \subset M_2 \subset \cdots \subset M_n = H,$$

the dimension of each $M_i$ is i, and each $M_i$ is invariant under T.
(Hint: if n = 1, the statement is clear; and if n > 1, assume it for all
Hilbert spaces of dimension n − 1, and prove it for H by using
Theorem 59-B and the fact that $T^*$ has an eigenvector.)

7. Let T be an arbitrary operator on H, and use the previous problem
to show that there exists a basis B relative to which the matrix
$[\alpha_{ij}]$ of T is triangular, in the sense that $i > j \Rightarrow \alpha_{ij} = 0$. If T is
normal, show that there exists an orthonormal basis B' relative to
which the matrix of T is diagonal. (Hint: generate B' by applying
the Gram-Schmidt process to B, observe that the matrix of T relative
to B' is still triangular, and use Problem 5 to show that this matrix is
actually diagonal.)


63. A SURVEY OF THE SITUATION 


The spectral theorem is often stated in a somewhat more restricted
form than that given in the previous section. The usual version is that
each normal operator N on H has a spectral resolution, that is, that there
exist distinct complex numbers $\lambda_1, \lambda_2, \ldots, \lambda_m$ and non-zero pairwise
orthogonal projections $P_1, P_2, \ldots, P_m$ such that $\sum_{i=1}^{m} P_i = I$, with the
property that

$$N = \sum_{i=1}^{m} \lambda_i P_i. \tag{1}$$


In our version, we attempted to give equal emphasis to both the geo- 
metric and the algebraic sides of the matter. Most writers, however, 
confine their statement of the theorem to that given above, and for a 
very good reason: it is (1) that generalizes to the infinite-dimensional case. 

There are two ways of carrying out this generalization, and we give a 
brief description of each. 

First, there is the analytic approach. For the sake of simplicity, we
consider a self-adjoint operator A, and we write (1) in the form

$$A = \sum_{i=1}^{m} \lambda_i P_i. \tag{2}$$

Our reason for making this assumption is that the eigenvalues of A are
real numbers and are therefore ordered in a natural way. We further
assume that the notation in (2) is chosen so that $\lambda_1 < \lambda_2 < \cdots < \lambda_m$,
and we use the $P_i$'s to define new projections:

$$\begin{aligned}
E_{\lambda_0} &= 0; \\
E_{\lambda_1} &= P_1; \\
E_{\lambda_2} &= P_1 + P_2; \\
&\;\;\vdots \\
E_{\lambda_m} &= P_1 + P_2 + \cdots + P_m.
\end{aligned}$$
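(The subscript $\lambda_0$ is introduced solely for notational convenience and has
no significance beyond this.) The $E_{\lambda_i}$'s enable us to rewrite (2) as follows:

$$\begin{aligned}
A &= \lambda_1 P_1 + \lambda_2 P_2 + \cdots + \lambda_m P_m \\
  &= \lambda_1 (E_{\lambda_1} - E_{\lambda_0}) + \lambda_2 (E_{\lambda_2} - E_{\lambda_1}) + \cdots + \lambda_m (E_{\lambda_m} - E_{\lambda_{m-1}}) \\
  &= \sum_{i=1}^{m} \lambda_i (E_{\lambda_i} - E_{\lambda_{i-1}}).
\end{aligned}$$

If we denote $E_{\lambda_i} - E_{\lambda_{i-1}}$ by $\Delta E_{\lambda_i}$, then we can compress this to

$$A = \sum_{i=1}^{m} \lambda_i \, \Delta E_{\lambda_i},$$

which suggests an integral representation

$$A = \int \lambda \, dE_\lambda. \tag{3}$$

In this form, the spectral resolution remains valid for self-adjoint
operators on infinite-dimensional Hilbert spaces. A similar result holds for
normal operators,

$$N = \int \lambda \, dE_\lambda. \tag{4}$$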


There are many difficulties to be surmounted in reaching the level of (3)
and (4). We have already met one of these, namely, the fact that an
operator T on an arbitrary Hilbert space $H \neq \{0\}$ need not have any
eigenvalues at all. In this general case, the spectrum of T is defined by

$$\sigma(T) = \{\lambda : T - \lambda I \text{ is singular}\}.$$

When H is finite-dimensional, we have seen that $\sigma(T)$ consists entirely of
eigenvalues. This made our work in the present chapter relatively easy,
but it is not true in general. What is true is that $\sigma(T)$ is always
non-empty, closed, and bounded, and is thus a compact subspace of the
complex plane. Once this difficulty is dealt with, there remain
substantial problems in giving meaning to integrals like those in (3) and (4)
and in proving the validity of these relations.¹

The second approach to generalizing (1) is essentially algebraic and
topological in nature. Its starting point is the observation made in the
previous section that the spectral resolution

$$N = \sum_{i=1}^{m} \lambda_i P_i$$


1 For a general discussion of the spectral theorem from this analytic point of
view, see Lorch [28]. A full treatment can be found in Riesz and Sz.-Nagy [35,
chap. 7].
leads to, and is actually part of, the fact that

$$p(N) = \sum_{i=1}^{m} p(\lambda_i) P_i \tag{5}$$

for any polynomial p. The set of all polynomials in N is evidently an
algebra of operators, a subalgebra of $\mathcal{B}(H)$. Let us now consider the
corresponding algebra of all polynomial functions defined on the set of
$\lambda_i$'s. We have seen that this algebra contains polynomials $p_j$ such that
$p_j(\lambda_i) = \delta_{ij}$, and it therefore consists of all complex functions defined on
the set of $\lambda_i$'s. If we denote the latter set by X for a moment and think
of it as a compact subspace of the complex plane, then, since X is finite,
the algebra in question is precisely $\mathcal{C}(X)$, the algebra of all continuous
complex functions defined on X. The mapping

$$p(N) \to p, \tag{6}$$


which makes correspond to each p(N) the function p in $\mathcal{C}(X)$, is easily
seen by the properties of (5) to preserve all algebraic operations. We
know from the previous section that the eigenvalues of p(N) are (with
possible repetitions) the $p(\lambda_i)$'s on the right of (5); and we shall see
later, as an unexpected dividend, that the norm of a normal operator
always equals the maximum of the absolute values of its eigenvalues.
It follows from these remarks that the mapping (6) is an isometric 
isomorphism of the algebra of all p(N)’s onto C(X). These ideas con- 
stitute an extended version of (5) and are thus, in a sense, a generalization 
of the spectral resolution of N. They apply virtually without change to 
the case of a normal operator on an infinite-dimensional Hilbert space, 
and one of our aims in the next three chapters is to treat them in detail. 


PART THREE 


Algebras of Operators 


CHAPTER TWELVE


General Preliminaries on Banach Algebras


Our work in Part 1 of this book was primarily concerned with 
topological spaces and the continuous functions carried by them. 

The ideas of Part 2, on the other hand, were essentially algebraic in 
nature. The function spaces we encountered earlier led us to begin a 
study of Banach spaces for their own sake, and as we proceeded, we 
found our attention focusing more and more closely on the properties 
of their operators. Except for a few elementary notions about metric 
spaces, we used very little genuine topology in Part 2. As a matter of
fact, our treatment of spectral theory in Chap. 11 was completely inde- 
pendent of topology, for in the finite-dimensional case, all linear trans- 
formations are continuous. 

In the following three chapters, these two apparently diverse trains 
of thought—the topological and the algebraic—are united by a single 
elegant concept: that of a Banach algebra. Our remarks in the last 
section of the previous chapter suggested that there may be important 
links between algebras of operators on Hilbert spaces and algebras of the 
type $\mathcal{C}(X)$, where X is a compact Hausdorff space. Banach algebras are
the systems which enable us to establish these connections on a firm 
footing. They are also interesting in that they constitute a field of study 
in which a wide variety of mathematical ideas meet in significant contact. 

Our main task in the present chapter is to provide a number of
miscellaneous tools which are necessary for the structure theory developed
later.


64. THE DEFINITION AND SOME EXAMPLES 


A Banach algebra is a complex Banach space which is also an algebra
with identity 1, and in which the multiplicative structure is related to the
norm by the following requirements:

(1) $\|xy\| \le \|x\| \, \|y\|$;

(2) $\|1\| = 1$.

It follows from (1) that multiplication is jointly continuous in any
Banach algebra, that is, that if $x_n \to x$ and $y_n \to y$, then $x_n y_n \to xy$
(proof: $\|x_n y_n - xy\| = \|x_n(y_n - y) + (x_n - x)y\| \le \|x_n\| \, \|y_n - y\| + \|x_n - x\| \, \|y\|$).


A Banach subalgebra of a Banach algebra A is a closed subalgebra of A 
which contains 1. The Banach subalgebras of A are precisely those 
subsets of A which are themselves Banach algebras with respect to the 
same algebraic operations, the same identity, and the same norm. 

The definition of a Banach algebra is sometimes given without the 
restriction that the scalars are the complex numbers. The complex case, 
however, is the only one that concerns us, and by framing the definition 
as we do, we avoid the necessity of treating the additional complications 
which arise in the real case. We have further assumed, for the sake of 
simplicity, that every Banach algebra has an identity. It is possible, 
at a considerable sacrifice of clarity, to develop most of the important 
ideas without this assumption, and this is done whenever the primary 
purpose of the theory is the study of group algebras of locally compact 
but not discrete groups. Since our attention will be directed chiefly to 
the structure of operator algebras, there is no need for us to strain for the 
added generality obtained by not requiring the presence of an identity. 

The Banach algebras of principal interest to us are described in the 
following examples. The reader will notice that they all consist of 
functions or operators and that the linear operations in all of them are 
defined pointwise. They can be classified in a general way into function 
algebras, operator algebras, or group algebras, according as multiplication 
is defined pointwise, by composition, or by convolution. 


Example 1. (a) One of the most important Banach algebras is the
set $\mathcal{C}(X)$ of all bounded continuous complex functions defined on a
topological space X. The case in which X is a compact Hausdorff space
will have particular significance for our later work. If X has only one
point, then $\mathcal{C}(X)$ can be identified with the simplest of all Banach algebras,
the algebra of complex numbers.

(b) Consider the closed unit disc $D = \{z : |z| \le 1\}$ in the complex
plane. The subset of $\mathcal{C}(D)$ which consists of all functions analytic in the
interior of D is obviously a subalgebra which contains the identity. A
simple application of Morera's theorem from complex analysis shows that
it is closed and is therefore a Banach subalgebra of $\mathcal{C}(D)$. This Banach
algebra is called the disc algebra. It has a number of interesting
properties, which are, of course, intimately related to the special character of
its functions.


Example 2. (a) If B is a non-trivial complex Banach space, then the
set $\mathcal{B}(B)$ of all operators on B is a Banach algebra. We assume that B is
non-trivial in order to guarantee that the identity operator is an identity
in the algebraic sense.

(b) If we consider a non-trivial Hilbert space H, then $\mathcal{B}(H)$ is a
Banach algebra. This is a special case of $\mathcal{B}(B)$, and it is important to
observe that additional structure is present here, namely, the adjoint
operation $T \to T^*$.

(c) A subalgebra of $\mathcal{B}(H)$ is said to be self-adjoint if it contains the
adjoint of each of its operators. Banach subalgebras of $\mathcal{B}(H)$'s which
are self-adjoint are called C*-algebras. We shall return to the subject
of commutative C*-algebras in Chap. 14.

(d) The weak operator topology on $\mathcal{B}(H)$ is the weak topology
generated by all functions of the form $T \to (Tx, y)$; that is, it is the weakest
topology with respect to which all these functions are continuous. It
is easy to see from the inequality $|(Tx, y) - (T_0x, y)| \le \|T - T_0\| \, \|x\| \, \|y\|$
that this topology is weaker than the usual norm topology, so that its
closed sets are also closed in the usual sense. A C*-algebra with the
further property of being closed in the weak operator topology is called
a W*-algebra. Algebras of this kind are also called rings of operators,
or von Neumann algebras. They are among the most interesting of all
Banach algebras, but their theory is quite beyond the scope of this book.¹


Example 3. (a) If $G = \{g_1, g_2, \ldots, g_n\}$ is a finite group, then its
group algebra $L_1(G)$ is the set of all complex functions defined on G.
Addition and scalar multiplication are defined pointwise, and the norm
by $\|f\| = \sum_{i=1}^{n} |f(g_i)|$. In order to see what underlies the definition of
multiplication, it is convenient to regard a typical element f of $L_1(G)$
as a formal sum $\sum_{i=1}^{n} \alpha_i g_i$, where $\alpha_i$ is the value of f at $g_i$. With this
interpretation, we use the given multiplication in G to define
multiplication in $L_1(G)$, as follows:

$$\Bigl( \sum_{i=1}^{n} \alpha_i g_i \Bigr) \Bigl( \sum_{j=1}^{n} \beta_j g_j \Bigr) = \sum_{k=1}^{n} \gamma_k g_k, \tag{1}$$

where

$$\gamma_k = \sum_{g_i g_j = g_k} \alpha_i \beta_j. \tag{2}$$

The meaning of the sum in (2) is that the summation is to be extended

1 See Dixmier [7].




over all subscripts i and j such that $g_i g_j = g_k$. In effect, therefore, we
formally multiply out the sums on the left of (1), and we then gather
together all the resulting terms which contain the same element of G.
With these ideas as an intuitive guide, we revert to our first point of
view, in which the elements of $L_1(G)$ are functions, and we see that our
definition of multiplication can be expressed in the following way. If
two functions f and g in $L_1(G)$ are given, then their product, which is
denoted by $f * g$ and called their convolution, is that function whose
value at $g_k$ is

$$(f * g)(g_k) = \sum_{g_i g_j = g_k} f(g_i) g(g_j) = \sum_{j=1}^{n} f(g_k g_j^{-1}) g(g_j). \tag{3}$$


We note that if each element of G is identified with the function whose
value is 1 at that element and 0 elsewhere, then G becomes a subset of
$L_1(G)$. Further, multiplication in G agrees with convolution in $L_1(G)$,
and the element of $L_1(G)$ which corresponds to the identity in G is an
identity for $L_1(G)$. We conclude this description by observing that every
element of G has norm 1, so that $\|1\| = 1$, and that the basic norm
inequality for a Banach algebra is satisfied:

$$\begin{aligned}
\|f * g\| &= \sum_{k=1}^{n} |(f * g)(g_k)| \\
&= \sum_{k=1}^{n} \Bigl| \sum_{j=1}^{n} f(g_k g_j^{-1}) g(g_j) \Bigr| \\
&\le \sum_{k=1}^{n} \sum_{j=1}^{n} |f(g_k g_j^{-1})| \, |g(g_j)| \\
&= \sum_{j=1}^{n} \sum_{k=1}^{n} |f(g_k g_j^{-1})| \, |g(g_j)| \\
&= \sum_{j=1}^{n} |g(g_j)| \sum_{k=1}^{n} |f(g_k g_j^{-1})| \\
&= \sum_{j=1}^{n} |g(g_j)| \, \|f\| \\
&= \|f\| \sum_{j=1}^{n} |g(g_j)| \\
&= \|f\| \, \|g\|.
\end{aligned}$$
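
Formula (3) and the norm inequality can be tried out on a small group. The sketch below is only an illustration under stated assumptions: it takes G to be the cyclic group of order 3, written additively as {0, 1, 2}, so that $g_k g_j^{-1}$ becomes (k − j) mod 3, and it represents a function f on G by the list of its values. The names `convolve`, `l1_norm`, and `delta`, and the sample functions f and g, are our own choices.

```python
# Sketch: convolution in the group algebra of the cyclic group Z_n.
# A function f on Z_n is stored as the list [f(0), f(1), ..., f(n-1)].

def convolve(f, g, n):
    """(f * g)(k) = sum over j of f(k - j) g(j), the additive form of Eq. (3)."""
    return [sum(f[(k - j) % n] * g[j] for j in range(n)) for k in range(n)]

def l1_norm(f):
    """The norm of the group algebra: the sum of the absolute values of f."""
    return sum(abs(v) for v in f)

def delta(k, n):
    """The function identified with the group element k: 1 at k, 0 elsewhere."""
    return [1 if i == k else 0 for i in range(n)]

n = 3
f = [1, 2j, -1]          # sample elements of the group algebra (assumed)
g = [0.5, 0, 1 - 1j]
h = convolve(f, g, n)

# Group elements multiply as in G: 1 + 2 = 0 in Z_3.
product_of_elements = convolve(delta(1, n), delta(2, n), n)

# The identity of G acts as an identity for convolution.
f_again = convolve(delta(0, n), f, n)

# The basic norm inequality for a Banach algebra.
inequality_holds = l1_norm(h) <= l1_norm(f) * l1_norm(g)
```

Here `product_of_elements` equals `delta(0, 3)` and `f_again` equals f, which mirrors the remarks that multiplication in G agrees with convolution and that the identity of G is an identity for the group algebra.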


(b) Let $G = \{\ldots, -2, -1, 0, 1, 2, \ldots\}$ be the additive group
of integers. Its group algebra $L_1(G)$ is the set of all complex functions f
defined on G for which $\sum_{n=-\infty}^{\infty} |f(n)|$ converges. The linear operations
are defined pointwise, the norm by $\|f\| = \sum_{n=-\infty}^{\infty} |f(n)|$, and the
convolution of f and g (see Eq. (3)) by

$$(f * g)(n) = \sum_{m=-\infty}^{\infty} f(n - m) g(m).$$

Just as in (a), G is contained in $L_1(G)$ in a natural way, and $L_1(G)$ is a
Banach algebra. Any attempt to discuss the group algebra of a
non-discrete topological group like the real line must clearly be based on an
adequate theory of integration. It should also have available a theory
of Banach algebras in which no identity is assumed to be present. These
ideas constitute a rich and beautiful field of modern analysis. They are,
however, outside the scope of this work.¹


The Banach algebras described above are many and diverse, and
there are yet others which we have not mentioned. Our attention in
the following chapters will be centered on $\mathcal{C}(X)$'s and commutative
C*-algebras, but the general theory we develop is equally applicable to
all. It is worthy of notice that an arbitrary Banach algebra A can be
regarded as a Banach subalgebra of $\mathcal{B}(A)$. In a sense, therefore,
Example 2a and its Banach subalgebras include all possible Banach algebras.
To see this, we recall from Problem 45-4 that $a \to M_a$, where $M_a(x) = ax$,
is an isomorphism of A into $\mathcal{B}(A)$. It is easy to see that $M_1$ is the identity
operator on A, so all that remains is to observe that $\|a\| = \|M_a\|$ for every
a (proof: $\|M_a(x)\| = \|ax\| \le \|a\| \, \|x\|$ shows that $\|M_a\| \le \|a\|$, and the
fact that $\|a\| \le \|M_a\|$ follows from
$\|M_a\| = \sup \{\|M_a(x)\| : \|x\| \le 1\} \ge \|M_a(1)\| = \|a\|$).

The mapping $a \to M_a$ is thus an isometric isomorphism of A onto a
Banach subalgebra of $\mathcal{B}(A)$, and it allows us to identify the abstract
Banach algebra A with a concrete Banach algebra of operators on A.


65. REGULAR AND SINGULAR ELEMENTS 


Let A be a Banach algebra. We denote the set of regular elements 
in A by G, and its complement, the set of singular elements, by S. It is 
clear that G contains 1 and is a group, and that S contains 0. Several 
important issues depend on the character of G and S. Our first result 
along these lines is 


1 Loomis [27] is the standard reference in this subject. For a general exposition 
of the main ideas, see Mackey [30]. A brief treatment of the classical analysis which 
underlies the modern theory can be found in Goldberg [14]. 
Theorem A. Every element x for which $\|x - 1\| < 1$ is regular, and the
inverse of such an element is given by the formula $x^{-1} = 1 + \sum_{n=1}^{\infty} (1 - x)^n$.

PROOF. If we put $r = \|x - 1\|$, so that r < 1, then

$$\|(1 - x)^n\| \le \|1 - x\|^n = r^n$$

shows that the partial sums of the series $\sum_{n=1}^{\infty} (1 - x)^n$ form a Cauchy
sequence in A. Since A is complete, these partial sums converge to an
element of A, which we denote by $\sum_{n=1}^{\infty} (1 - x)^n$. If we define y by
$y = 1 + \sum_{n=1}^{\infty} (1 - x)^n$, then the joint continuity of multiplication in A
implies that

$$y - xy = (1 - x)y = (1 - x) + \sum_{n=1}^{\infty} (1 - x)^{n+1} = y - 1,$$

so xy = 1. Similarly, yx = 1.
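
The series in Theorem A is easy to observe in a concrete Banach algebra. The following sketch is an illustration only, under assumptions of our own: A is taken to be the algebra of 2 × 2 real matrices with the maximum absolute row sum as norm (an operator norm, so it is submultiplicative and gives the identity norm 1), the element x is an arbitrary choice with $\|1 - x\| < 1$, and the series is truncated after 80 terms.

```python
# Sketch: the series 1 + sum (1 - x)^n inverts x when ||1 - x|| < 1.
# Matrices are 2 x 2 lists; the element x below is an assumed example.

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(a, b):
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def mat_sub(a, b):
    return [[p - q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def norm(a):
    """Maximum absolute row sum; submultiplicative, with norm(ONE) == 1."""
    return max(sum(abs(v) for v in row) for row in a)

ONE = [[1.0, 0.0], [0.0, 1.0]]
x = [[1.0, 0.3], [-0.2, 0.9]]
d = mat_sub(ONE, x)            # the element 1 - x
r = norm(d)                    # about 0.3, so the hypothesis r < 1 holds

# Partial sum y = 1 + d + d^2 + ... + d^80.
y = ONE
term = ONE
for _ in range(80):
    term = mat_mul(term, d)
    y = mat_add(y, term)

# Both products should be the identity up to rounding error.
left_residual = norm(mat_sub(mat_mul(x, y), ONE))
right_residual = norm(mat_sub(mat_mul(y, x), ONE))
```

Since the tail of the series is bounded by a geometric series with ratio r, truncating at 80 terms leaves an error far below floating-point precision for this x; a value of r closer to 1 would need more terms.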
We now use this as a tool to prove 


Theorem B. G is an open set, and therefore S is a closed set.

PROOF. Let $x_0$ be an element in G, and let x be any element in A such
that $\|x - x_0\| < 1/\|x_0^{-1}\|$. It is clear that

$$\|x_0^{-1}x - 1\| = \|x_0^{-1}(x - x_0)\| \le \|x_0^{-1}\| \, \|x - x_0\| < 1,$$

so we see by Theorem A that $x_0^{-1}x$ is in G. Since $x = x_0(x_0^{-1}x)$, it
follows that x is also in G, so G is open.


It was shown in Problem 32-5 that every Banach space is locally 
connected, so A is also locally connected. A direct application of 
Theorem 34-A yields the fact that the components of G are themselves 
open sets. 

As our final result, we have 


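Theorem C. The mapping $x \to x^{-1}$ of G into G is continuous and is
therefore a homeomorphism of G onto itself.

PROOF. Let $x_0$ be an element of G, and x another element of G such
that $\|x - x_0\| < 1/(2\|x_0^{-1}\|)$. Since

$$\|x_0^{-1}x - 1\| = \|x_0^{-1}(x - x_0)\| \le \|x_0^{-1}\| \, \|x - x_0\| < \tfrac{1}{2},$$

we see by Theorem A that $x_0^{-1}x$ is in G and

$$x^{-1}x_0 = (x_0^{-1}x)^{-1} = 1 + \sum_{n=1}^{\infty} (1 - x_0^{-1}x)^n.$$

Our conclusion now follows from

$$\begin{aligned}
\|x^{-1} - x_0^{-1}\| &= \|(x^{-1}x_0 - 1)x_0^{-1}\| \le \|x_0^{-1}\| \, \|x^{-1}x_0 - 1\| \\
&= \|x_0^{-1}\| \, \Bigl\| \sum_{n=1}^{\infty} (1 - x_0^{-1}x)^n \Bigr\| \le \|x_0^{-1}\| \sum_{n=1}^{\infty} \|1 - x_0^{-1}x\|^n \\
&= \|x_0^{-1}\| \, \|1 - x_0^{-1}x\| \sum_{n=0}^{\infty} \|1 - x_0^{-1}x\|^n \\
&= \frac{\|x_0^{-1}\| \, \|1 - x_0^{-1}x\|}{1 - \|1 - x_0^{-1}x\|} \\
&\le 2\|x_0^{-1}\| \, \|1 - x_0^{-1}x\| \le 2\|x_0^{-1}\|^2 \, \|x - x_0\|.
\end{aligned}$$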


If x is an element in A, it should always be kept in mind that the
regularity or singularity of x depends on A as well as on x itself. If x
is regular in A, and if we pass to a Banach subalgebra A' of A which also
contains x, then x may lose its inverse and become singular in A'. By
the same token, if x is singular in A, and if A is regarded as a Banach
subalgebra of a larger Banach algebra A'', then x may acquire an inverse
and become regular in A''. In the next section, we study certain
elements in A which are singular and remain singular with respect to all
possible enlargements of A.


66. TOPOLOGICAL DIVISORS OF ZERO 


An element z in our Banach algebra A is called a topological divisor
of zero if there exists a sequence $\{z_n\}$ in A such that $\|z_n\| = 1$ and either
$zz_n \to 0$ or $z_n z \to 0$. It is clear that every divisor of zero is also a
topological divisor of zero. We denote the set of all topological divisors of
zero by Z.


Theorem A. Z is a subset of S.


PROOF. Let z be an element of Z and $\{z_n\}$ a sequence such that $\|z_n\| = 1$
and (say) $zz_n \to 0$. If z were in G, then by the joint continuity of
multiplication we would have $z^{-1}(zz_n) = z_n \to 0$, contrary to $\|z_n\| = 1$.


Our next theorem relates to the manner in which Z is distributed 
within S. 


Theorem B. The boundary of S is a subset of Z. 

PROOF. Since S is closed, its boundary consists of all points in S which
are limits of convergent sequences in G. We show that if z is such a
point, that is, if z is in S and there exists a sequence $\{r_n\}$ in G such that
$r_n \to z$, then z is in Z. First, we see from $r_n^{-1}z - 1 = r_n^{-1}(z - r_n)$ that
the sequence $\{r_n^{-1}\}$ is unbounded; for otherwise, we would have
$\|r_n^{-1}z - 1\| < 1$
for some n, so that $r_n^{-1}z$, and therefore $z = r_n(r_n^{-1}z)$, would be regular.
Since $\{r_n^{-1}\}$ is unbounded, we may assume that $\|r_n^{-1}\| \to \infty$. If $z_n$ is
now defined by $z_n = r_n^{-1}/\|r_n^{-1}\|$, then our conclusion follows from the
observations that $\|z_n\| = 1$ and

$$zz_n = \frac{z r_n^{-1}}{\|r_n^{-1}\|} = \frac{1 + (z - r_n) r_n^{-1}}{\|r_n^{-1}\|} = \frac{1}{\|r_n^{-1}\|} + (z - r_n) z_n \to 0.$$


In order to understand the significance of these facts, let us suppose 
that A is imbedded as a Banach subalgebra in a larger Banach algebra A’. 
As we remarked in the previous section, an element which is singular in A 
may cease to be so in A’. However, if it is a topological divisor of zero 
in A, then it is in A’ as well, so it is singular in A’. The topological 
divisors of zero in A are thus “permanently singular,” in the sense that 
they are singular and remain so with respect to every possible enlarge- 
ment of the containing Banach algebra. Theorem B tells us that no 
matter what happens to S as a whole in such a process, its boundary is 
“permanent” in this sense. 


67. THE SPECTRUM 


Let T be an operator on a non-trivial Hilbert space. In the previous
chapter, we defined the spectrum of T to be the set

$$\sigma(T) = \{\lambda : T - \lambda I \text{ is singular}\},$$

and we devoted a good deal of attention to the geometric ideas leading
to this concept. We found (at least in the finite-dimensional case) that
a number in $\sigma(T)$ is a value assumed by T, in the sense that T acts on some
non-zero vector as if it were scalar multiplication by that number. We
shall see later that this formulation of the meaning of the spectrum has a
much wider significance than we might at first suspect.

Let us now consider an element x in our general Banach algebra A.
By analogy with the above, we define the spectrum of x to be the following
subset of the complex plane:

$$\sigma(x) = \{\lambda : x - \lambda 1 \text{ is singular}\}.$$

Whenever it is desirable to express the fact that the spectrum of x depends
on A as well as x, we use the notation $\sigma_A(x)$. It is easy to see that $x - \lambda 1$
is a continuous function of $\lambda$ with values in A; and since the set of singular
elements in A is closed, it follows at once that $\sigma(x)$ is closed. We further
observe that $\sigma(x)$ is a subset of the closed disc $\{z : |z| \le \|x\|\}$, for if $\lambda$ is a
complex number such that $|\lambda| > \|x\|$, then $\|x/\lambda\| < 1$, $\|1 - (1 - x/\lambda)\| < 1$,
$1 - x/\lambda$ is regular, and therefore $x - \lambda 1$ is regular.

Our first task is to establish the fact that $\sigma(x)$ is always non-empty,
and for this we need a few preliminary notions. The resolvent set of x,
denoted by $\rho(x)$, is the complement of $\sigma(x)$; it is clearly an open subset
of the complex plane which contains $\{z : |z| > \|x\|\}$. The resolvent of x
is the function with values in A defined on $\rho(x)$ by

$$x(\lambda) = (x - \lambda 1)^{-1}.$$

Theorem 65-C tells us that $x(\lambda)$ is a continuous function of $\lambda$; and the
fact that $x(\lambda) = \lambda^{-1}(x/\lambda - 1)^{-1}$ for $\lambda \neq 0$ implies that $x(\lambda) \to 0$ as
$\lambda \to \infty$. If $\lambda$ and $\mu$ are both in $\rho(x)$, then

$$\begin{aligned}
x(\lambda) &= x(\lambda)[x - \mu 1]x(\mu) \\
&= x(\lambda)[x - \lambda 1 + (\lambda - \mu)1]x(\mu) \\
&= [1 + (\lambda - \mu)x(\lambda)]x(\mu) \\
&= x(\mu) + (\lambda - \mu)x(\lambda)x(\mu),
\end{aligned}$$

so

$$x(\lambda) - x(\mu) = (\lambda - \mu)x(\lambda)x(\mu).$$

This relation is called the resolvent equation.
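
As a sanity check, the resolvent equation can be verified numerically in a small matrix algebra. The sketch below rests on assumptions of our own: A is the algebra of 2 × 2 complex matrices, x is a sample matrix whose spectrum is {i, −i}, and the two points 2 and 1 + i are arbitrary members of its resolvent set. The helper names `inverse_2x2` and `resolvent` are hypothetical.

```python
# Sketch: checking x(lam) - x(mu) = (lam - mu) x(lam) x(mu) for 2 x 2 matrices.

def inverse_2x2(a):
    """Inverse of a 2 x 2 matrix by the adjugate formula (det must be non-zero)."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det],
            [-a[1][0] / det, a[0][0] / det]]

def resolvent(x, lam):
    """x(lam) = (x - lam * 1)^(-1), defined when lam is not in the spectrum."""
    return inverse_2x2([[x[0][0] - lam, x[0][1]],
                        [x[1][0], x[1][1] - lam]])

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

x = [[0, 1], [-1, 0]]     # assumed example; its eigenvalues are i and -i
lam, mu = 2, 1 + 1j       # two points of the resolvent set

r_lam = resolvent(x, lam)
r_mu = resolvent(x, mu)
prod = mat_mul(r_lam, r_mu)

# Largest entrywise gap between the two sides of the resolvent equation.
gap = max(abs((r_lam[i][j] - r_mu[i][j]) - (lam - mu) * prod[i][j])
          for i in range(2) for j in range(2))
```

The value of `gap` is zero up to rounding error, as the resolvent equation predicts.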


Theorem A. $\sigma(x)$ is non-empty.

PROOF. Let f be a functional on A (that is, an element of the conjugate
space A*) and define $f(\lambda)$ by $f(\lambda) = f(x(\lambda))$. It is clear that $f(\lambda)$ is a
complex function which is defined and continuous on the resolvent set
$\rho(x)$. The resolvent equation shows that

$$\frac{f(\lambda) - f(\mu)}{\lambda - \mu} = f(x(\lambda)x(\mu)),$$

and it follows from this that

$$\lim_{\lambda \to \mu} \frac{f(\lambda) - f(\mu)}{\lambda - \mu} = f(x(\mu)^2),$$

so $f(\lambda)$ has a derivative at each point of $\rho(x)$. Further,

$$|f(\lambda)| \le \|f\| \, \|x(\lambda)\|,$$

so $f(\lambda) \to 0$ as $\lambda \to \infty$. We now assume that $\sigma(x)$ is empty, so that $\rho(x)$
is the entire complex plane. Liouville's theorem from complex analysis
allows us to conclude that $f(\lambda) = 0$ for all $\lambda$. Since f is an arbitrary
functional on A, Theorem 48-B implies that $x(\lambda) = 0$ for all $\lambda$. This is
impossible, for no inverse can equal 0, and therefore it cannot be true
that $\sigma(x)$ is empty.
If the reader is surprised by the appearance of Liouville's theorem
in such a context, he should recall two facts. First, our proof of Theorem 
61-A, which is a special case of the above result, required the use of the 
fundamental theorem of algebra. And second, the fundamental theorem 
of algebra is most commonly proved as a simple consequence of Liouville’s 
theorem. It is therefore only to be expected that some tool from analysis 
comparable in depth with Liouville’s theorem should be necessary for 
the proof of Theorem A. 

Now that we know that $\sigma(x)$ is non-empty, we also know that it is a
compact subspace of the complex plane. The number r(x) defined by

$$r(x) = \sup \{|\lambda| : \lambda \in \sigma(x)\}$$

is called the spectral radius of x. It is clear that $0 \le r(x) \le \|x\|$. The
concept of the spectral radius will be useful in certain parts of our later 
work. 
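
The bound on the spectral radius can be seen in a small example. The sketch below is an illustration under assumptions of our own: A is the algebra of 2 × 2 complex matrices with the maximum absolute row sum as norm, so that the spectrum of a matrix is its set of eigenvalues, and the sample matrix x is an arbitrary choice. The eigenvalues are found from the characteristic polynomial, and the helper names are hypothetical.

```python
# Sketch: spectral radius versus norm for a 2 x 2 matrix.
import cmath

def eigenvalues_2x2(a):
    """Roots of the characteristic polynomial z^2 - (trace) z + det."""
    trace = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    root = cmath.sqrt(trace * trace - 4 * det)
    return [(trace + root) / 2, (trace - root) / 2]

def norm(a):
    """Maximum absolute row sum, an operator norm on 2 x 2 matrices."""
    return max(sum(abs(v) for v in row) for row in a)

x = [[0, 2], [-1, 0]]                  # assumed example
spectrum = eigenvalues_2x2(x)          # the two square roots of -2
spectral_radius = max(abs(lam) for lam in spectrum)
x_norm = norm(x)
```

For this x the spectral radius is the square root of 2 while the norm is 2, so the second inequality in $0 \le r(x) \le \|x\|$ can be strict.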

We recall that a division algebra is an algebra with identity in which 
each non-zero element is regular. The most important single conse- 
quence of Theorem A is 


Theorem B. If A is a division algebra, then it equals the set of all scalar 
multiples of the identity. 

PROOF. We must show that if x is an element of A, then x equals $\lambda 1$ for
some scalar $\lambda$. Suppose, on the contrary, that $x \neq \lambda 1$ for every $\lambda$.
Then $x - \lambda 1 \neq 0$ for every $\lambda$, $x - \lambda 1$ is regular for every $\lambda$, and therefore
$\sigma(x)$ is empty. This contradicts Theorem A and completes the proof.


The mapping $\lambda 1 \to \lambda$ is clearly an isometric isomorphism of the set
of all scalar multiples of the identity onto the Banach algebra C of all 
complex numbers. We may therefore identify this set with C; and in 
terms of this identification, Theorem B says that any Banach algebra 
which is a division algebra equals C. This fact is the foundation on which 
we build the structure theory presented in the next chapter. 

It is obvious that C itself, which is the simplest of all Banach alge- 
bras, is a division algebra, so Theorem B characterizes C as the only 
Banach algebra with this property. In the next two theorems, we give 
some other interesting characterizations of C among all possible Banach 
algebras. 

Since 0 is a divisor of zero, it is a topological divisor of zero in every 
Banach algebra. In the Banach algebra C, 0 is plainly the only topologi- 
cal divisor of zero. Conversely, we have 


Theorem C. If 0 is the only topological divisor of zero in A, then A = C. 


PROOF. Let x be an element of A. Its spectrum $\sigma(x)$ is non-empty, so
it has a boundary point $\lambda$; and $x - \lambda 1$ is easily seen to be a boundary
point of the set S of all singular elements. By Theorem 66-B, $x - \lambda 1$
is a topological divisor of zero, so it follows from our hypothesis that
$x - \lambda 1 = 0$ or $x = \lambda 1$.


The basic link between multiplication in A and the norm is given 
by the inequality $\|xy\| \le \|x\| \, \|y\|$, and when A = C, this inequality can 
be reversed. The following result shows to what extent this reversibility 
is true in general. 


Theorem D. If the norm in A satisfies the inequality $\|xy\| \ge K \|x\| \, \|y\|$ 
for some positive constant K, then A = C. 

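PROOF. In the light of Theorem C, it suffices to observe that the 
hypothesis here implies that 0 is the only topological divisor of zero: 
if $\|y_n\| = 1$ and $xy_n \to 0$ (or $y_n x \to 0$), then $K\|x\| = K\|x\| \, \|y_n\| \le \|xy_n\| \to 0$ 
(and similarly for $y_n x$), so x = 0. 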


We next look into the question of what happens to the spectrum 
of an element x in A when A is enlarged. 


Theorem E. If A is a Banach subalgebra of a Banach algebra A', then 
the spectra of an element x in A with respect to A and A' are related as follows: 
(1) $\sigma_{A'}(x) \subseteq \sigma_A(x)$; (2) each boundary point of $\sigma_A(x)$ is also a boundary 
point of $\sigma_{A'}(x)$. 

PROOF. If $x - \lambda 1$ is singular in A', then it is certainly singular in A, 
so (1) is clear. To prove (2), we let $\lambda$ be a boundary point of $\sigma_A(x)$. It 
is easy to see that $x - \lambda 1$ is a boundary point of the set of singular ele- 
ments in A, so by Theorem 66-B, it is a topological divisor of zero in A. 
It is therefore a topological divisor of zero in A' as well, so it is singular 
in A' and $\lambda$ is in $\sigma_{A'}(x)$. The fact that $\lambda$ is actually a boundary point 
of $\sigma_{A'}(x)$ is immediate from (1), so the proof of (2) is complete. 


This result shows that in general the spectrum of an element shrinks 
when its containing Banach algebra is enlarged, and further, that since 
its boundary points cannot be lost in this process, it must shrink by 
“hollowing out.” An illuminating example of this phenomenon is 
provided by the disc algebra A of all complex functions which are defined 
and continuous on $D = \{z : |z| \le 1\}$ and analytic in the interior of this 
set. If f is a function in A, then the maximum modulus theorem from 
complex analysis implies that 


$$\|f\| = \sup \{|f(z)| : |z| \le 1\} = \sup \{|f(z)| : |z| = 1\}.$$


This allows us to identify A with the Banach algebra of all the restrictions 
of its functions to the boundary of D, which is a Banach subalgebra of 
$A' = \mathcal{C}(\{z : |z| = 1\})$. If we now consider the element f in A defined by 
f(z) = z, then it is easy to see that $\sigma_A(f)$ equals D and that $\sigma_{A'}(f)$ equals 
the boundary of D. 
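
The hollowing out in this example can be probed numerically. The sketch below is an illustration only, not part of the text's argument. It assumes the standard fact that a continuous function on the circle is the boundary value of a function in the disc algebra only if all its Fourier coefficients of negative index vanish; the helper names are hypothetical. For f(z) = z, the function 1/(z - λ) inverts f - λ1 on the circle whenever |λ| ≠ 1, and the coefficient of z^(-1) indicates whether that inverse can belong to A.

```python
import cmath

def fourier_coefficient(g, k, samples=64):
    """Approximate the k-th Fourier coefficient of g on |z| = 1
    by averaging g(z) * z**(-k) over equally spaced points."""
    total = 0j
    for j in range(samples):
        z = cmath.exp(2j * cmath.pi * j / samples)
        total += g(z) * z ** (-k)
    return total / samples

def resolvent(lam):
    """The function 1 / (z - lam), the inverse of f - lam*1 on the circle."""
    return lambda z: 1 / (z - lam)

# Coefficient of z**(-1): if it is non-zero, the inverse is not in the disc algebra.
inside = fourier_coefficient(resolvent(0.3), -1)   # |lam| < 1
outside = fourier_coefficient(resolvent(2.0), -1)  # |lam| > 1
print(abs(inside), abs(outside))
```

For a point inside the disc the coefficient is close to 1, so the inverse exists in A' but not in A; for a point outside the disc it is numerically zero, consistent with the inverse lying in both algebras.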


68. THE FORMULA FOR THE SPECTRAL RADIUS 


Let x be an element in our general Banach algebra A, and consider 
its spectral radius r(x), which is defined by 


$$r(x) = \sup \{|\lambda| : \lambda \in \sigma_A(x)\}.$$


Now let A' be the Banach subalgebra of A generated by x, that is, the 
closure of the set of all polynomials in x. Theorem 67-E shows that r(x) 
has the same value if it is computed with respect to A': 


$$r(x) = \sup \{|\lambda| : \lambda \in \sigma_{A'}(x)\}.$$


This suggests quite strongly that r(x) depends only on the sequence of 
powers of x. The formula for r(x) is given in Theorem A below, and our 
purpose in this section is to prove it. It is convenient to begin with the 
following preliminary result. 


Lemma. $\sigma(x^n) = \sigma(x)^n$. 
PROOF. Let $\lambda$ be a non-zero complex number and $\lambda_1, \lambda_2, \ldots, \lambda_n$ its 
distinct nth roots, so that 


$$x^n - \lambda 1 = (x - \lambda_1 1)(x - \lambda_2 1) \cdots (x - \lambda_n 1).$$


The statement of the lemma follows easily from the fact that $x^n - \lambda 1$ 
is singular $\Leftrightarrow$ $x - \lambda_i 1$ is singular for at least one i. 
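
As a sanity check on the lemma, one can take A to be the algebra of 2 x 2 complex matrices, where the spectrum of x is its set of eigenvalues. The following sketch makes that assumption; the matrix and the helper names are chosen only for illustration.

```python
import cmath

def mat_mul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_pow(m, n):
    result = [[1, 0], [0, 1]]
    for _ in range(n):
        result = mat_mul(result, m)
    return result

def eigenvalues_2x2(m):
    """Roots of the characteristic polynomial t^2 - tr(m) t + det(m)."""
    (a, b), (c, d) = m
    tr, det = a + d, a * d - b * c
    disc = cmath.sqrt(tr * tr - 4 * det)
    return [(tr + disc) / 2, (tr - disc) / 2]

def same_set(u, v, tol=1e-9):
    """True if every point of u is within tol of a point of v, and conversely."""
    return (all(any(abs(p - q) < tol for q in v) for p in u)
            and all(any(abs(p - q) < tol for q in u) for p in v))

x = [[0, -2], [1, 2]]                  # eigenvalues 1 + i and 1 - i
n = 3
spectrum_of_power = eigenvalues_2x2(mat_pow(x, n))
power_of_spectrum = [lam ** n for lam in eigenvalues_2x2(x)]
print(same_set(spectrum_of_power, power_of_spectrum))
```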


Theorem A. $r(x) = \lim \|x^n\|^{1/n}$. 
PROOF. Our lemma shows that $r(x^n) = r(x)^n$, and since $r(x^n) \le \|x^n\|$, 
we have $r(x)^n \le \|x^n\|$ or $r(x) \le \|x^n\|^{1/n}$ for every n. To conclude 
the proof, it suffices to show that if a is any real number such that 
$r(x) < a$, then $\|x^n\|^{1/n} < a$ for all but a finite number of n's, and this we 
now do. 

It follows from Theorem 65-A and our work in Sec. 67 that if $|\lambda| > \|x\|$, 
then 


$$x(\lambda) = (x - \lambda 1)^{-1} = -\lambda^{-1}\left(1 - \frac{x}{\lambda}\right)^{-1} = -\lambda^{-1}\left[1 + \sum_{n=1}^{\infty} \left(\frac{x}{\lambda}\right)^n\right]. \qquad (1)$$


If f is any functional on A, then (1) yields 


$$f(x(\lambda)) = -\lambda^{-1}\left[f(1) + \sum_{n=1}^{\infty} f\left(\frac{x^n}{\lambda^n}\right)\right] = -\lambda^{-1}\left[f(1) + \sum_{n=1}^{\infty} f(x^n)\lambda^{-n}\right] \qquad (2)$$


for all $|\lambda| > \|x\|$. We saw in the proof of Theorem 67-A that $f(x(\lambda))$ is 
an analytic function in the region $|\lambda| > r(x)$; and since (2) is its Laurent 
expansion for $|\lambda| > \|x\|$, we know from complex analysis that this 
expansion is valid for $|\lambda| > r(x)$. If we now let $\alpha$ be any real number such 
that $r(x) < \alpha < a$, then it follows from the preceding remark that the 
series $\sum_{n=1}^{\infty} f(x^n/\alpha^n)$ converges, so its terms form a bounded sequence. 
Since this is true for every f in A*, an application of Theorem 51-B shows 
that the elements $x^n/\alpha^n$ form a bounded sequence in A. Thus 


$$\|x^n/\alpha^n\| \le K$$


or $\|x^n\|^{1/n} \le K^{1/n}\alpha$ for some positive constant K and every n. Since 
$K^{1/n}\alpha < a$ for every sufficiently large n, we have $\|x^n\|^{1/n} < a$ for all but a 
finite number of n's, and the proof is complete. 
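
The formula can be watched at work in a small example. The sketch below is an illustration under stated assumptions: A is taken to be the algebra of 2 x 2 matrices with the maximum-row-sum norm (a submultiplicative norm in which the identity has norm 1), and the matrix x is chosen so that its norm is far larger than its spectral radius. The function names are hypothetical.

```python
def mat_mul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def row_sum_norm(m):
    """Maximum absolute row sum: submultiplicative, with norm(identity) = 1."""
    return max(sum(abs(e) for e in row) for row in m)

def root_norms(x, exponents):
    """Return {n: ||x^n||^(1/n)} for each n in exponents."""
    results = {}
    power = [[1.0, 0.0], [0.0, 1.0]]
    for n in range(1, max(exponents) + 1):
        power = mat_mul(power, x)
        if n in exponents:
            results[n] = row_sum_norm(power) ** (1.0 / n)
    return results

# Upper triangular, so the spectrum is {0.5} and r(x) = 0.5, although ||x|| = 100.5.
x = [[0.5, 100.0], [0.0, 0.5]]
values = root_norms(x, {1, 10, 100, 400})
for n in sorted(values):
    print(n, values[n])
```

The printed values decrease toward 0.5 and never fall below it, in line with the inequality $r(x) \le \|x^n\|^{1/n}$ established at the start of the proof.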


The applications we make of this formula will appear in the next 
chapter. 


69. THE RADICAL AND SEMI-SIMPLICITY 


Our final preliminary task is to reach a clear understanding of what 
is meant by the statement that our Banach algebra A is semi-simple. 
For this, it is necessary to give an adequate definition of the radical of A, 
and this in turn depends on a detailed analysis of its ideals. 

We recall that an ideal in A was defined in Sec. 45 to be a subset I 
with the following three properties: 

(1) I is a linear subspace of A; 

(2) $i \in I \Rightarrow xi \in I$ for every element $x \in A$; 

(3) $i \in I \Rightarrow ix \in I$ for every element $x \in A$. 

If I is assumed only to satisfy conditions (1) and (2) [or conditions (1) 
and (3)], it is called a left ideal (or a right ideal). For the sake of clarity 
and emphasis, an ideal in our previous sense (one which satisfies all 
three of these conditions) is often called a two-sided ideal. In the 
commutative case, of course, these three concepts coincide with one 
another. 

The properties of the ideals in A are closely related to the properties 
of its regular and singular elements. In our work so far, the statement 
that an element x in A is regular has meant that there exists an element 
y such that xy = yx = 1. For our present purposes, it is useful to refine 
this notion slightly, as follows. We say that x is left regular if there 
exists an element y such that yx = 1; and if x is not left regular, it is 
called left singular. The terms right regular and right singular are defined 
similarly. If x is both left regular and right regular, so that there exist 
elements y and z such that yx = 1 and xz = 1, then the relation 


$$y = y1 = y(xz) = (yx)z = 1z = z$$


shows that x is regular in the ordinary sense and that $x^{-1} = y = z$. 
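
In a finite-dimensional algebra a left inverse is automatically a two-sided inverse, so the distinction between left and right regularity only shows itself in infinite dimensions. The sketch below is an illustration outside the text: it assumes the algebra of linear operators on finitely supported sequences (modelled as Python lists), with the hypothetical names right_shift for x and left_shift for y.

```python
def right_shift(v):
    """x: (v0, v1, ...) -> (0, v0, v1, ...)."""
    return [0] + list(v)

def left_shift(v):
    """y: (v0, v1, v2, ...) -> (v1, v2, ...)."""
    return list(v[1:])

def trim(v):
    """Drop trailing zeros so that equal sequences compare equal."""
    v = list(v)
    while v and v[-1] == 0:
        v.pop()
    return v

v = [3, 1, 4]
yx = trim(left_shift(right_shift(v)))   # (yx)(v) = y(x(v))
xy = trim(right_shift(left_shift(v)))   # (xy)(v) = x(y(v))
print(yx, xy)
```

Here yx acts as the identity, so x is left regular, while xy destroys the first coordinate. Since x is not onto, no operator z can satisfy xz = 1, so x is right singular.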

The concept of maximality for two-sided ideals was introduced in 
Sec. 41. By analogy, we define a maximal left ideal in A to be a proper 
left ideal which is not properly contained in any other proper left ideal. 
A straightforward application of Zorn’s lemma shows that any proper 
left ideal can be imbedded in a maximal left ideal; and since the zero 
ideal {0} is a proper left ideal, maximal left ideals certainly exist. We 
now define the radical R of A to be the intersection of all its maximal 
left ideals. It will be convenient to abbreviate this definition by writing 
$R = \bigcap \mathrm{MLI}$. R is clearly a proper left ideal. 

These ideas can be formulated just as easily for right ideals as for 
left ideals, and there is no reason for giving preference to either side over 
the other. The purpose of the following chain of lemmas is to show that 
R is also the intersection of all the maximal right ideals in A, that is, that 
$R = \bigcap \mathrm{MRI}$. 


Lemma. If r is an element of R, then $1 - r$ is left regular. 
PROOF. We assume that $1 - r$ is left singular, so that 


$$L = A(1 - r) = \{x - xr : x \in A\}$$


is a proper left ideal which contains $1 - r$. We next imbed L in a 
maximal left ideal M, which of course also contains $1 - r$. Since r is in 
R, it is also in M, and therefore $1 = (1 - r) + r$ is in M. This implies 
that M = A, which is a contradiction. 


Lemma. If r is an element of R, then $1 - r$ is regular. 

PROOF. By the lemma just proved, there exists an element s such 
that $s(1 - r) = 1$, so s is right regular and $s = 1 - (-s)r$. The fact 
that R is a left ideal implies that $(-s)r$ is in R along with r, and another 
application of the preceding lemma shows that $1 - (-s)r = s$ is left 
regular. Since s is both left regular and right regular, it is regular with 
inverse $1 - r$, so $1 - r$ is also regular. 


Lemma. If r is an element of R, then $1 - xr$ is regular for every x. 
PROOF. R is a left ideal, so xr is in R and the statement follows from 
the lemma just proved. 


Lemma. If r is an element of A with the property that $1 - xr$ is regular 
for every x, then r is in R. 

PROOF. We assume that r is not in R, so that r is not in some maxi- 
mal left ideal M. It is easy to see that the set 


$$M + Ar = \{m + xr : m \in M \text{ and } x \in A\}$$

is a left ideal which contains both M and r, so M + Ar = A and 

$$m + xr = 1$$


for some m and x. It now follows that $1 - xr = m$ is a regular element 
in M, and this is impossible, for no proper ideal can contain any regular 
element. 


The effect of these lemmas is to establish the equality of two sets: 

$$\bigcap \mathrm{MLI} = \{r : 1 - xr \text{ is regular for every } x\}. \qquad (1)$$


Precisely the same arguments, when applied to maximal right ideals, 
show that 


$$\bigcap \mathrm{MRI} = \{r : 1 - rx \text{ is regular for every } x\}. \qquad (2)$$


We now prove that all four of these sets are the same by showing that the 
two sets on the right of (1) and (2) are equal to one another. By sym- 
metry, it evidently suffices to prove the 

Lemma. If $1 - xr$ is regular, then $1 - rx$ is also regular. 

PROOF. We assume that $1 - xr$ is regular with inverse 


$$s = (1 - xr)^{-1}.$$


This means, of course, that $(1 - xr)s = s(1 - xr) = 1$. We leave 
it to the reader to show, by a simple computation, that 


$$(1 - rx)(1 + rsx) = (1 + rsx)(1 - rx) = 1,$$


so that $1 - rx$ is regular with inverse $1 + rsx$. (The formula for $(1 - rx)^{-1}$ 
is less mysterious than it looks, as the reader can see by inspecting the 
meaningless but suggestive expressions 


$$s = (1 - xr)^{-1} = 1 + xr + (xr)^2 + \cdots$$


and 


$$(1 - rx)^{-1} = 1 + rx + (rx)^2 + (rx)^3 + \cdots = 1 + rx + rxrx + rxrxrx + \cdots = 1 + r(1 + xr + xrxr + \cdots)x = 1 + rsx.$$ ) 
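
For readers who want the computation written out, one way to carry it through is given below. It uses only the identities $xrs = s - 1$ and $sxr = s - 1$, which are rearrangements of $(1 - xr)s = 1$ and $s(1 - xr) = 1$ respectively.

```latex
\begin{align*}
(1 - rx)(1 + rsx) &= 1 + rsx - rx - r(xrs)x \\
  &= 1 + rsx - rx - r(s - 1)x \\
  &= 1 + rsx - rx - rsx + rx = 1, \\[4pt]
(1 + rsx)(1 - rx) &= 1 - rx + rsx - r(sxr)x \\
  &= 1 - rx + rsx - r(s - 1)x \\
  &= 1 - rx + rsx - rsx + rx = 1.
\end{align*}
```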

We summarize our results in 


Theorem A. The radical R of A equals each of the four sets in (1) and (2) 
and is therefore a proper two-sided ideal. 


A is said to be semi-simple if its radical equals the zero ideal {0}, 
that is, if each non-zero element of A is outside of some maximal left ideal. 

It will be observed that the ideas discussed above are purely algebraic 
in nature. They can be applied not only to our Banach algebra A, but 
also to any algebra or ring with identity. Our interest, however, is in A, 
and we now bring to bear upon these notions the results of Sec. 65, 
notably, the fact that the set S of all singular elements in A is closed. 

We begin by noting that if I is any ideal in A (left, right, or two- 
sided), then by the joint continuity of the algebraic operations, its 
closure $\bar{I}$ is an ideal of the same kind. Next, since any proper ideal is 
contained in the proper closed set S, the closure of any proper ideal is a 
proper ideal of the same kind. It is an easy step from these facts to 


Theorem B. Every maximal left ideal in A is closed. 

PROOF. If any maximal left ideal L is not closed, then L is a proper 
subset of the proper left ideal $\bar{L}$; and this cannot happen, for it contradicts 
the maximality of L. 


Taken together, the above two theorems yield 


Theorem C. The radical R of A is a proper closed two-sided ideal. 


We shall also need 


Theorem D. If I is a proper closed two-sided ideal in A, then the quotient 
algebra A/I is a Banach algebra. 


PROOF. Theorem 46-A tells us that A/I is a non-trivial complex Banach 
space with respect to the norm defined by 


$$\|x + I\| = \inf \{\|x + i\| : i \in I\}.$$

Further, A/I is clearly an algebra with identity 1 + I, and 

$$\|1 + I\| = \inf \{\|1 + i\| : i \in I\} \le \|1\| = 1.$$

The multiplicative inequality for the norm is easily proved as follows: 


$$\begin{aligned}
\|(x + I)(y + I)\| &= \|xy + I\| = \inf \{\|xy + i\| : i \in I\} \\
&\le \inf \{\|(x + i_1)(y + i_2)\| : i_1, i_2 \in I\} \\
&\le \inf \{\|x + i_1\| \, \|y + i_2\| : i_1, i_2 \in I\} \\
&= [\inf \{\|x + i_1\| : i_1 \in I\}] [\inf \{\|y + i_2\| : i_2 \in I\}] \\
&= \|x + I\| \, \|y + I\|.
\end{aligned}$$

All that remains is to show that $\|1 + I\| = 1$; and since we already have 
$\|1 + I\| \le 1$, this is an immediate consequence of the fact that 
$\|1 + I\| = \|(1 + I)^2\| \le \|1 + I\|^2$ implies $1 \le \|1 + I\|$. 


As a final result, we state 


Theorem E. A/R is a semi-simple Banach algebra. 

PROOF. It suffices to observe that the natural homomorphism $x \to x + R$ 
of A onto A/R induces a one-to-one correspondence between the maxi- 
mal left ideals in A and those in A/R. 


In the following chapters, we shall be concerned almost exclusively 
with commutative Banach algebras. An algebra of this kind is of 
course much easier to handle than one which is not commutative, for all 
its ideals are two-sided and its radical is simply the intersection of its 
maximal ideals. Our reason for studying the general case here is that 
when it becomes necessary to assume commutativity, as it will in the 
next section, we want the force of this assumption, and the issues that 
depend on it, to be quite clear. 


CHAPTER THIRTEEN 


The Structure of 


Commutative Banach Algebras 


The set $\mathcal{C}(X)$ of all bounded continuous complex functions defined 
on a topological space X is the simplest of the really interesting Banach 
algebras. Our purpose in this chapter is to prove the famous Gelfand- 
Neumark theorem, which says that every commutative Banach algebra A 
of a certain type is essentially identical with $\mathcal{C}(X)$ for a suitable compact 
Hausdorff space X. More precisely, we shall prove that a compact 
Hausdorff space X can be built out of the inner structure of A, that X is 
accompanied by a natural mapping of A into $\mathcal{C}(X)$, and that this mapping 
is one-to-one onto and preserves all the structure assumed to be present 
in A. 


70. THE GELFAND MAPPING 


Let A be an arbitrary commutative Banach algebra. Our first 
theorem below is the principal source of the structure theory of A, and 
the remainder of the chapter will be devoted entirely to shaping its 
consequences into the elegant form of the Gelfand-Neumark theorem. 


Theorem A. If M is a maximal ideal in A, then the Banach algebra A/M 
is a division algebra, and therefore equals the Banach algebra C of complex 
numbers. The natural homomorphism $x \to x + M$ of A onto A/M = C 
assigns to each element x in A a complex number x(M) defined by 


$$x(M) = x + M,$$


and the mapping $x \to x(M)$ has the following properties: 

(1) $(x + y)(M) = x(M) + y(M)$; 

(2) $(\alpha x)(M) = \alpha x(M)$; 

(3) $(xy)(M) = x(M)y(M)$; 

(4) $x(M) = 0 \Leftrightarrow x \in M$; 

(5) $1(M) = 1$; 

(6) $|x(M)| \le \|x\|$. 
PROOF. Theorems 69-B and 69-D tell us that A/M is indeed a 
Banach algebra. Since A contains an identity, M is maximal as a ring 
ideal (see the comments on this matter in Sec. 45); and therefore, by 
Theorem 41-C, A/M is a division algebra. We now appeal to Theorem 
67-B to conclude that A/M equals C. (Actually, of course, A/M equals 
the set of all scalar multiples of its own identity, but we identify this set 
with C in accordance with the remarks following Theorem 67-B.) Finally, 
properties (1) to (5) are obvious consequences of the nature of the homo- 
morphism under discussion, and (6) follows from 


$$|x(M)| = |x + M| = \|x + M\| = \inf \{\|x + m\| : m \in M\} \le \|x\|.$$


It is interesting to observe that this proof depends, either directly 
or indirectly, on virtually every major theorem in the previous chapter. 
We also note that the ultimate reason for assuming that A is commutative 
lies in Theorem 41-A, which is definitely not true in the non-commutative 
case (see Problem 41-1). 

The language of Theorem A is oriented toward the idea that x(M) 
is a function of x for each fixed M. The notation, however, suggests that 
we reverse this point of view and that for each fixed x we regard x(M) as 
a complex function defined on the set $\mathfrak{M}$ of all maximal ideals in A. 
This is the direction in which we now proceed. 

If x is a given element of A, we denote by $\hat{x}$ the function defined on 
$\mathfrak{M}$ by $\hat{x}(M) = x(M)$, and we put $\hat{A} = \{\hat{x} : x \in A\}$. Our next step is to 
define a topology for $\mathfrak{M}$ in such a manner that every function in $\hat{A}$ is 
continuous. The most natural way of doing this is to introduce the weak 
topology generated by $\hat{A}$. It will be recalled that this is the weakest 
topology on $\mathfrak{M}$ relative to which every function $\hat{x}$ is continuous and that a 
typical subbasic open set has the form 

$$S(\hat{x}, M_0, \epsilon) = \{M : M \in \mathfrak{M} \text{ and } |\hat{x}(M) - \hat{x}(M_0)| < \epsilon\}.$$

We call the topological space $\mathfrak{M}$ the space of maximal ideals, or the 
maximal ideal space, and the mapping $x \to \hat{x}$ of A onto $\hat{A}$ will be referred 
to as the Gelfand mapping. 

We are now in a position to reformulate Theorem A, and to extend 
it, in such a way that the Gelfand mapping is displayed as the object of 
central importance. 


Theorem B. The Gelfand mapping $x \to \hat{x}$ is a norm-decreasing (and 
therefore continuous) homomorphism of A into $\mathcal{C}(\mathfrak{M})$ with the following 
properties: 

(1) the image $\hat{A}$ of A is a subalgebra of $\mathcal{C}(\mathfrak{M})$ which separates the 
points of $\mathfrak{M}$ and contains the identity of $\mathcal{C}(\mathfrak{M})$; 

(2) the radical R of A equals the set of all elements x for which $\hat{x} = 0$, 
so $x \to \hat{x}$ is an isomorphism $\Leftrightarrow$ A is semi-simple; 

(3) an element x in A is regular $\Leftrightarrow$ it does not belong to any maximal 
ideal $\Leftrightarrow$ $\hat{x}(M) \ne 0$ for every M; 

(4) if x is an element of A, then its spectrum equals the range of the 
function $\hat{x}$ and its spectral radius equals the norm of $\hat{x}$, that is, $\sigma(x) = \hat{x}(\mathfrak{M})$ 
and $r(x) = \sup |\hat{x}(M)| = \|\hat{x}\|$. 
PROOF. The definition of the topology on $\mathfrak{M}$ guarantees that each 
function $\hat{x}$ is continuous, and part (6) of Theorem A shows that $\hat{x}$ is 
bounded and that $\|\hat{x}\| = \sup |\hat{x}(M)| \le \|x\|$, so $x \to \hat{x}$ is a norm-decreasing 
mapping of A into $\mathcal{C}(\mathfrak{M})$. The fact that this mapping is a homomorphism 
is immediate from parts (1), (2), and (3) of Theorem A. 

Since $x \to \hat{x}$ is a homomorphism, $\hat{A}$ is obviously a subalgebra of 
$\mathcal{C}(\mathfrak{M})$. The stated properties of $\hat{A}$ follow readily from parts (4) and (5) 
of Theorem A: if $M_1 \ne M_2$, and if (say) x is in $M_1$ but not in $M_2$, then 
$\hat{x}(M_1) = 0$ and $\hat{x}(M_2) \ne 0$; and $\hat{1}(M) = 1$ for every M. 

If we recall that R is the intersection of all the M's, then the proof of 
(2) is easy: we have only to notice that part (4) of Theorem A tells us 
that $\hat{x}(M) = 0$ for every M $\Leftrightarrow$ x is in every M. 

To prove (3), it suffices, in view of part (4) of Theorem A, to show 
that x is regular $\Leftrightarrow$ it does not belong to any M. It is elementary that a 
regular element cannot lie in any proper ideal, so we confine our attention 
to showing that if x is singular, then it does belong to some M. We prove 
this by observing that the singularity of x implies that $Ax = \{yx : y \in A\}$ 
is a proper ideal which contains x and can therefore be imbedded in a 
maximal ideal M which also contains x. 

Finally, we use (3) to prove (4). By the definition of the spectrum of 
x, we have $\lambda \in \sigma(x)$ $\Leftrightarrow$ $x - \lambda 1$ is singular $\Leftrightarrow$ $\widehat{(x - \lambda 1)}(M) = 0$ for at least 
one M $\Leftrightarrow$ $(\hat{x} - \lambda \hat{1})(M) = 0$ for at least one M $\Leftrightarrow$ $\hat{x}(M) = \lambda$ for at least 
one M, so $\sigma(x)$ equals the range of $\hat{x}$. The rest of (4) follows from this 
statement and the definition of the spectral radius. 
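
A very small example may help fix these statements. The sketch below is an illustration outside the text. It assumes the two-dimensional commutative algebra of elements a1 + bN with N² = 0, normed by |a| + |b|; this algebra has exactly one maximal ideal, the set of multiples of N, so its maximal ideal space is a single point and the Gelfand mapping sends a1 + bN to a. The class and method names are hypothetical.

```python
class Dual:
    """Element a*1 + b*N of the commutative algebra in which N*N = 0."""

    def __init__(self, a, b):
        self.a, self.b = complex(a), complex(b)

    def __mul__(self, other):
        # (a + bN)(c + dN) = ac + (ad + bc)N because N*N = 0
        return Dual(self.a * other.a, self.a * other.b + self.b * other.a)

    def norm(self):
        # |a| + |b| is submultiplicative and gives the identity norm 1
        return abs(self.a) + abs(self.b)

    def gelfand(self):
        # value of the transform at the only maximal ideal M = {bN}
        return self.a

    def inverse(self):
        if self.a == 0:
            raise ZeroDivisionError("element lies in the maximal ideal")
        return Dual(1 / self.a, -self.b / (self.a * self.a))

x, y, N = Dual(2, 5), Dual(-1, 3), Dual(0, 1)
product = x * y
unit = x * x.inverse()
print(product.gelfand(), N.gelfand(), N.norm())
```

In this algebra an element is regular exactly when its transform is non-zero, as in part (3), and the non-zero element N has transform 0, so the radical is not {0} and the Gelfand mapping is not one-to-one, as in part (2).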


We add the final touch to this portion of the theory by showing that 
$\mathfrak{M}$ is a compact Hausdorff space. The reader will recall that if A* is 
the conjugate space of A, then its closed unit sphere 


$$S^* = \{f : f \in A^* \text{ and } \|f\| \le 1\}$$


is a compact Hausdorff space in the weak* topology (see Theorem 49-A). 
Our strategy is to identify $\mathfrak{M}$, both as a set and as a topological space, 
with a closed subspace of S*. 

A multiplicative functional on A is a functional f in the ordinary sense 
(that is, an element of the conjugate space A*) which is non-zero and 
satisfies the additional condition f(xy) = f(x)f(y). Theorem A shows 
that to each M in $\mathfrak{M}$ there corresponds a multiplicative functional $f_M$ 
defined by $f_M(x) = x(M)$. It is important for us to know that $M \to f_M$ is 
a one-to-one mapping of $\mathfrak{M}$ onto the set of all multiplicative functionals. 
It will facilitate our work if we begin by proving the 


Lemma. If $f_1$ and $f_2$ are multiplicative functionals on A with the same null 
space M, then $f_1 = f_2$. 

PROOF. We first show that $f_1 = \alpha f_2$ for some scalar $\alpha$. Let $x_0$ be 
an element of A which is not in M. If x is an arbitrary element of A, it is 
easy to see that x can be expressed uniquely in the form $x = m + \beta x_0$ 
with m in M (set $\beta = f_2(x)/f_2(x_0)$, put $m = x - \beta x_0$, and observe that 
$f_2(m) = 0$). It now follows that 


$$f_1(x) = f_1(m) + \beta f_1(x_0) = \beta f_1(x_0) = [f_1(x_0)/f_2(x_0)] f_2(x),$$


so $f_1 = \alpha f_2$ with $\alpha = f_1(x_0)/f_2(x_0)$. We complete the proof by showing 
that $\alpha$ equals 1. Let x be an element not in M, so that $f_2(x) \ne 0$. Then 
$\alpha f_2(x)^2 = \alpha f_2(x^2) = f_1(x^2) = f_1(x)^2 = [\alpha f_2(x)]^2 = \alpha^2 f_2(x)^2$ implies that 


$$\alpha^2 = \alpha,$$

so $\alpha = 0$ or $\alpha = 1$. Since $f_1 \ne 0$, we conclude that $\alpha = 1$. 

We now use this to prove 


Theorem C. $M \to f_M$ is a one-to-one mapping of the set $\mathfrak{M}$ of all maximal 
ideals in A onto the set of all its multiplicative functionals. 

PROOF. The mapping is easily seen to be one-to-one, for if $M_1 \ne M_2$, 
and if (say) x is in $M_1$ and not in $M_2$, then $f_{M_1}(x) = 0$ and $f_{M_2}(x) \ne 0$. 
To prove that it is onto, let f be an arbitrary multiplicative functional, 
and consider its null space $M = \{x : f(x) = 0\}$. It is clear by the 
assumed properties of f that M is a proper closed ideal in A. Further- 
more, M is maximal, for if it were properly contained in a proper ideal I, 
then f(I) would be a non-trivial ideal in C, contrary to Theorem 41-A. 
Since f and $f_M$ are multiplicative functionals with the same null space, 
the lemma just proved implies that $f = f_M$, and our proof is complete. 
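
To see the theorem in a concrete case, consider the group algebra of the cyclic group of order n, with convolution as multiplication and the l1 norm. The sketch below is an illustration under assumptions not made in the text: it relies on the standard fact that the multiplicative functionals of this algebra are given by the n characters of the group, so that the Gelfand mapping is the discrete Fourier transform. The function names are hypothetical.

```python
import cmath

def convolve(x, y):
    """Product in the group algebra of the cyclic group of order n = len(x)."""
    n = len(x)
    return [sum(x[j] * y[(k - j) % n] for j in range(n)) for k in range(n)]

def character(k, n):
    """Candidate multiplicative functional f_k(x) = sum_j x[j] * w**(j*k)."""
    w = cmath.exp(2j * cmath.pi / n)
    return lambda x: sum(x[j] * w ** (j * k) for j in range(n))

def l1_norm(x):
    return sum(abs(t) for t in x)

x = [1, 2j, -1, 0.5]
y = [0, 1, 3, -2j]
n = len(x)

checks = []
for k in range(n):
    f = character(k, n)
    multiplicative = abs(f(convolve(x, y)) - f(x) * f(y)) < 1e-9
    bounded = abs(f(x)) <= l1_norm(x) + 1e-12    # |f(x)| <= ||x||
    checks.append((multiplicative, bounded))
print(checks)
```

Finding the maximal ideals of this algebra directly is awkward; listing the functionals f_k and taking their null spaces is the analytic route the theorem makes available.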


In some of its more concrete applications, this theorem is used to 
replace the algebraic problem of determining the maximal ideals in A by 
the analytic problem of finding its multiplicative functionals. Its 
importance for our current task of showing that $\mathfrak{M}$ is a compact Hausdorff 
space is that it enables us to regard $\mathfrak{M}$ as a subset of A*. We can say 
even more than this, for parts (5) and (6) of Theorem A tell us that 
every multiplicative functional $f_M$ has norm 1, so $\mathfrak{M}$ is a subset of the 
closed unit sphere S*. We recalled earlier that S* is a compact Haus- 
dorff space with respect to the weak* topology, which is (see Sec. 49) 
the weak topology generated by all the functions $F_x$ defined on S* by 
$F_x(f) = f(x)$. We now observe that when $F_x$ is restricted to $\mathfrak{M}$, it is 
precisely $\hat{x}$, for 

$$F_x(f_M) = f_M(x) = x(M) = \hat{x}(M).$$


Therefore, by Problem 19-1c, the topology which $\mathfrak{M}$ has as a subspace of 
S* is exactly its topology as the space of maximal ideals. These con- 
siderations permit us to regard $\mathfrak{M}$ as a subspace of S*. 


Theorem D. The maximal ideal space $\mathfrak{M}$ is a compact Hausdorff space. 
PROOF. In view of the above discussion, it suffices to show that $\mathfrak{M}$ is 
a closed subspace of S*. We accomplish this by forming the subspace 
X of S* defined by 


$$X = \bigcap_{x, y \in A} \{f : f \in S^* \text{ and } f(xy) = f(x)f(y)\}.$$


It is evident that X is simply $\mathfrak{M}$ together with the zero functional; and 
since we have 


$$\begin{aligned}
X &= \bigcap_{x, y \in A} \{f : f \in S^* \text{ and } f(xy) - f(x)f(y) = 0\} \\
  &= \bigcap_{x, y \in A} \{f : f \in S^* \text{ and } F_{xy}(f) - F_x(f)F_y(f) = 0\} \\
  &= \bigcap_{x, y \in A} \{f : f \in S^* \text{ and } (F_{xy} - F_x F_y)(f) = 0\},
\end{aligned}$$


it is easy to see that X is closed in S* (note that each of the sets last 
written has this property). We next remark that $F_1$ is continuous on X 
and equals 1 on $\mathfrak{M}$ and 0 at the zero functional. It follows from this that 
$\mathfrak{M}$ is closed in X and is therefore closed in S*. 


It is worthy of notice that the topology we imposed on $\mathfrak{M}$ is the only 
one which makes it into a compact Hausdorff space on which all the 
functions $\hat{x}$ are continuous, for by Theorem 26-E, any stronger compact 
Hausdorff topology must equal the given one. 

When Theorems B and D are taken together, the result is often 
called the Gelfand representation theorem. In essence, this tells us that 
every commutative semi-simple Banach algebra is isomorphic to an 
algebra of continuous complex functions on a suitable compact Hausdorff 
space. In general, the norm is not preserved by this isomorphism and 
the representing algebra does not exhaust the continuous functions on the 
underlying space. We shall remove these deficiencies in the following 
sections by assuming that additional structure is present in the Banach 
algebra under discussion. 


71. APPLICATIONS OF THE FORMULA $r(x) = \lim \|x^n\|^{1/n}$ 


We continue our study of an arbitrary commutative Banach algebra 
A and of the Gelfand mapping $x \to \hat{x}$ of A onto the subalgebra $\hat{A}$ of 
$\mathcal{C}(\mathfrak{M})$. Our first theorem provides a simple way of guaranteeing that 
this mapping preserves norms. 


Theorem A. The following conditions on A are all equivalent to one another: 
(1) $\|x^2\| = \|x\|^2$ for every x; 
(2) $r(x) = \|x\|$ for every x; 
(3) $\|\hat{x}\| = \|x\|$ for every x. 

PROOF. It follows from condition (1) that 


$$\|x^4\| = \|(x^2)^2\| = \|x^2\|^2 = \|x\|^4$$


and, in general, that $\|x^{2^k}\| = \|x\|^{2^k}$ for every positive integer k. The 
formula for the spectral radius now yields 


$$r(x) = \lim \|x^n\|^{1/n} = \lim \|x^{2^k}\|^{1/2^k} = \lim \|x\| = \|x\|,$$


so (1) implies (2). The fact that (2) implies (1) is immediate from 
$\|x^2\| = r(x^2) = r(x)^2 = \|x\|^2$. In view of the equation $r(x) = \|\hat{x}\|$ (see 
Theorem 70-B), the equivalence of (2) and (3) is obvious. 
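
The conditions of this theorem can fail, and a two-dimensional example shows how. The sketch below is an illustration outside the text: it assumes the group algebra of the two-element group with the l1 norm, whose two multiplicative functionals are taken to be x0 + x1 and x0 - x1; the names are hypothetical.

```python
def convolve2(x, y):
    """Product in the group algebra of the two-element group."""
    return [x[0] * y[0] + x[1] * y[1], x[0] * y[1] + x[1] * y[0]]

def l1_norm(x):
    return abs(x[0]) + abs(x[1])

def gelfand_transform(x):
    """Values of x at the two multiplicative functionals x0 + x1 and x0 - x1."""
    return [x[0] + x[1], x[0] - x[1]]

x = [1, 1j]
square = convolve2(x, x)
norm_of_square = l1_norm(square)                               # ||x^2||
square_of_norm = l1_norm(x) ** 2                               # ||x||^2
sup_of_transform = max(abs(v) for v in gelfand_transform(x))   # norm of the transform
print(norm_of_square, square_of_norm, sup_of_transform)
```

Here the norm of the square is strictly smaller than the square of the norm, and correspondingly the transform has strictly smaller norm than x, so the Gelfand mapping of this algebra does not preserve norms.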


Our next problem is to devise a way of making sure that the repre- 
senting algebra $\hat{A}$ comes as close as it can to exhausting $\mathcal{C}(\mathfrak{M})$, and we 
accomplish this by introducing the following property. A is said to be 
self-adjoint if for each x in A there exists an element y in A such that 
$\hat{y}(M) = \overline{\hat{x}(M)}$ for every M. 


Theorem B. If A is self-adjoint, then $\hat{A}$ is dense in $\mathcal{C}(\mathfrak{M})$. 

PROOF. By part (1) of Theorem 70-B, we know that $\hat{A}$ is a sub- 
algebra of $\mathcal{C}(\mathfrak{M})$ which separates the points of $\mathfrak{M}$ and contains the 
identity function. Problem 20-3 and our hypothesis now tell us that the 
closure of $\hat{A}$ is a closed subalgebra of $\mathcal{C}(\mathfrak{M})$ which separates points, con- 
tains the identity function, and contains the conjugate of each of its 
functions. Theorem 36-B (the complex Stone-Weierstrass theorem) 
shows that this closure equals $\mathcal{C}(\mathfrak{M})$, so $\hat{A}$ itself is dense in $\mathcal{C}(\mathfrak{M})$. 


If we put together the results obtained in the above two theorems, 
we have 


Theorem C. If A is self-adjoint, and if $\|x^2\| = \|x\|^2$ for every x, then the 
Gelfand mapping $x \to \hat{x}$ is an isometric isomorphism of A onto $\mathcal{C}(\mathfrak{M})$. 


PROOF. By Theorem A, the mapping $x \to \hat{x}$ preserves norms. It is 
therefore an isometric isomorphism of A onto $\hat{A}$, and we see from this 
that $\hat{A}$ is closed in $\mathcal{C}(\mathfrak{M})$. Since $\hat{A}$ is dense in $\mathcal{C}(\mathfrak{M})$ by Theorem B, it 
follows that $\hat{A}$ equals $\mathcal{C}(\mathfrak{M})$, and the proof is complete. 


This theorem lacks a certain simplicity which it ought to have, for 
the condition of self-adjointness is rather far removed from the intrinsic 
structure of A. Our work in the next two sections will remedy this defect 
and at the same time will establish closer connections with the operator 
algebras to which we apply our final result. 


72. INVOLUTIONS IN BANACH ALGEBRAS 


A Banach algebra A is called a Banach *-algebra if it has an involution, 
that is, if there exists a mapping $x \to x^*$ of A into itself with the following 
properties: 

(1) $(x + y)^* = x^* + y^*$; 

(2) $(\alpha x)^* = \bar{\alpha} x^*$; 

(3) $(xy)^* = y^* x^*$; 

(4) $x^{**} = x$. 

It is an easy consequence of (4) that the involution $x \to x^*$ is actually a 
one-to-one mapping of A onto itself. We also note that $0^* = 0$ and 
$1^* = 1$, as we see from $0 + x^* = x^* = (0 + x)^* = 0^* + x^*$ and 
$1^* = 1 1^* = 1^{**} 1^* = (1 1^*)^* = (1^*)^* = 1^{**} = 1$. The element $x^*$ is 
called the adjoint of x, and a subalgebra of A is said to be self-adjoint if it 
contains the adjoint of each of its elements. If A' is also a Banach 
*-algebra, and if f is an isomorphism of A onto A', then f is called a 
*-isomorphism if it preserves the involution in the sense that $f(x^*) = f(x)^*$. 

We naturally want the involution in a Banach *-algebra to be linked 
in some useful way to the norm. The property $\|x^*\| = \|x\|$ clearly 
implies that the involution is continuous; for if $x_n \to x$, then 


$$\|x_n^* - x^*\| = \|(x_n - x)^*\| = \|x_n - x\|$$


shows that $x_n^* \to x^*$. A much stronger relation between the involution 
and the norm is given by the condition 


$$\|x^* x\| = \|x\|^2,$$


and any Banach *-algebra which satisfies it is called a B*-algebra. It is 
easy to see that we have $\|x^*\| = \|x\|$ in every B*-algebra; for 

$$\|x\|^2 = \|x^* x\| \le \|x^*\| \, \|x\|$$

shows that $\|x\| \le \|x^*\|$ for every x, so $\|x^*\| \le \|x^{**}\| = \|x\|$, and therefore 
$\|x^*\| = \|x\|$. It follows from this that the relation $\|x^* x\| = \|x^*\| \, \|x\|$ is 
also true. 

Several of the Banach algebras described in Sec. 64 are also Banach 
*-algebras with respect to natural involutions. If X is any topological 
space, then $\mathcal{C}(X)$ is clearly a commutative B*-algebra relative to the 
involution defined by $f^*(x) = \overline{f(x)}$. The disc algebra, however, is not, 
for if f is the function defined by f(z) = z, then $f^*(z) = \bar{z}$ is not analytic 
at any point. If H is a non-trivial Hilbert space, then $\mathcal{B}(H)$ is a B*- 
algebra with the adjoint operation $T \to T^*$ taken as the involution (see 
Theorem 56-A). Since C*-algebras are the self-adjoint Banach sub- 
algebras of $\mathcal{B}(H)$'s, they too are B*-algebras. Finally, the group 
algebra $L_1(G)$ of a finite group G is a Banach *-algebra with respect to the 
involution defined by $f^*(g_i) = \overline{f(g_i^{-1})}$, and it is easy to see that $\|f^*\| = \|f\|$. 
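
The difference between a Banach *-algebra and a B*-algebra can be seen in the last of these examples. The sketch below is an illustration under stated assumptions: it takes G to be the cyclic group of order 3, uses convolution as the product and the l1 norm, and uses the involution that sends f to the conjugate of f evaluated at the inverse group element. The function names are hypothetical.

```python
def convolve(x, y):
    """Product in the group algebra of the cyclic group of order n = len(x)."""
    n = len(x)
    return [sum(x[j] * y[(k - j) % n] for j in range(n)) for k in range(n)]

def star(x):
    """Involution: conjugate of x at the inverse element (-k mod n)."""
    n = len(x)
    return [complex(x[(-k) % n]).conjugate() for k in range(n)]

def l1_norm(x):
    return sum(abs(t) for t in x)

x = [1, 1, -1]
y = [2j, 0, 1]

# Property (3) of an involution: (xy)* = y* x*
anti_multiplicative = all(
    abs(p - q) < 1e-12
    for p, q in zip(star(convolve(x, y)), convolve(star(y), star(x)))
)
isometric = abs(l1_norm(star(x)) - l1_norm(x)) < 1e-12   # ||x*|| = ||x||
lhs = l1_norm(convolve(star(x), x))                      # ||x* x||
rhs = l1_norm(x) ** 2                                    # ||x||^2
print(anti_multiplicative, isometric, lhs, rhs)
```

For this x the involution is isometric, yet the norm of x*x is strictly smaller than the square of the norm of x, so this group algebra with the l1 norm is a Banach *-algebra that is not a B*-algebra.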

It should be reasonably clear that Banach *-algebras (and espe- 
cially B*-algebras) are modeled along lines suggested by $\mathcal{B}(H)$. We have 
already called the element $x^*$ in such an algebra the adjoint of x. By 
analogy, we say that x is self-adjoint if $x = x^*$, normal if $xx^* = x^*x$, and a 
projection if $x = x^*$ and $x^2 = x$. 


Theorem A. If x is a normal element in a B*-algebra, then $\|x^2\| = \|x\|^2$. 
PROOF. It is obvious that $\|x^2\| \le \|x\|^2$. The inequality in the other 
direction is a consequence of the following computation: 


$$\begin{aligned}
\|x^*\|^2 \|x\|^2 &= (\|x^*\| \, \|x\|)^2 = \|x^* x\|^2 = \|(x^* x)^* x^* x\| = \|x^* x x^* x\| \\
&= \|x^* x^* x x\| = \|(x^*)^2 x^2\| = \|(x^2)^* x^2\| = \|(x^2)^*\| \, \|x^2\| \\
&= \|(x^*)^2\| \, \|x^2\| \le \|x^*\|^2 \|x^2\|.
\end{aligned}$$


This result suggests more strongly than ever that there are close 
connections between B*-algebras and algebras of operators on Hilbert 
spaces (see Theorem 58-D). We describe the true state of affairs in this 
matter at the end of the next section. 


73. THE GELFAND-NEUMARK THEOREM 


We are now in a position to give Theorem 71-C its final form. 


Theorem A. If A ts a commutative B*-algebra, then the Gelfand mapping 
x2 — £ is an isometric *-isomorphism of A onto the commutative B*-algebra 
e(I). 
prooF. Since A is commutative, each of its elements is normal, and 
it follows from Theorem 72-A that ||x?|| = ||z|/?for every x. By Theorem 
71-C, it now suffices to show that z*(M) = Z(M) for each xz in A and M 
in OW. 

Our first step is to prove that if x is self-adjoint, then $\hat{x}(M)$ is real
for every M. We assume the contrary, namely, that there exists an M
such that $\hat{x}(M) = \alpha + i\beta$ with $\beta \ne 0$. Since x is self-adjoint,

\[ y = (x - \alpha 1)/\beta \]

is also self-adjoint. We further note that $\hat{y}(M) = i$, so $y - i1$ is in M.
It is obvious from the properties of the involution in A that

\[ M^* = \{m^* : m \in M\} \]

is a maximal ideal; and since it contains $(y - i1)^* = y + i1$, we see that
$\hat{y}(M^*) = -i$. If K is any positive number, then

\[ \widehat{(y - iK1)}(M^*) = -i(1 + K) \]

and $\widehat{(y + iK1)}(M) = i(1 + K)$. It follows from this that
$1 + K \le \|\widehat{(y - iK1)}\| \le \|y - iK1\|$ and, similarly, that $1 + K \le \|y + iK1\|$. On
multiplying these two inequalities, we obtain

\[
\begin{aligned}
(1 + K)^2 &\le \|y - iK1\|\,\|y + iK1\| = \|(y + iK1)^*\|\,\|y + iK1\| \\
&= \|(y + iK1)^*(y + iK1)\| = \|(y - iK1)(y + iK1)\| \\
&= \|y^2 + K^2 1\| \le \|y^2\| + K^2,
\end{aligned}
\]

so $1 + 2K \le \|y^2\|$. Since K is arbitrary, this is impossible, and this
portion of our proof is complete.

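We now conclude the proof by showing that if x is any element of A,
then $\widehat{x^*}(M) = \overline{\hat{x}(M)}$ for every M. It is clear that $y = (x + x^*)/2$ and
$z = (x - x^*)/(2i)$ are self-adjoint, and that $x = y + iz$; and therefore,
by the result of the above paragraph, we have

\[
\begin{aligned}
\widehat{x^*}(M) &= \widehat{(y - iz)}(M) = \hat{y}(M) - i\hat{z}(M) = \overline{\hat{y}(M)} - i\,\overline{\hat{z}(M)} \\
&= \overline{\hat{y}(M)} + \overline{i\hat{z}(M)} = \overline{\hat{x}(M)}.
\end{aligned}
\]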


We already know that if X is any compact Hausdorff space, then
$\mathcal{C}(X)$ is a commutative B*-algebra. The theorem just proved, which is
called the Gelfand-Neumark representation theorem, tells us that commutative
B*-algebras are simply abstract $\mathcal{C}(X)$'s, in the sense that every
such algebra is abstractly identical with $\mathcal{C}(X)$ for a suitable compact
Hausdorff space X.

There is another Gelfand-Neumark theorem of great interest,
which applies to arbitrary B*-algebras. We observed in the previous
section that every C*-algebra is a B*-algebra. The converse of this is
also true, for if A is a B*-algebra, then there exists a Hilbert space H
with the property that A is isometrically *-isomorphic to a C*-algebra
of operators on H.¹ General B*-algebras are therefore abstract C*-algebras.
The proof of this theorem evidently requires that a suitable
Hilbert space be constructed out of the given structure of A. The
details of this construction are beyond the scope of this book, and we
content ourselves with merely stating the facts.


1 See Rickart [34, p. 244].


CHAPTER FOURTEEN


Some Special Commutative Banach Algebras


Our discussion in Sec. 63 foreshadowed a generalized form of the 
spectral theorem, and the principal purpose of the present chapter is to 
formulate and prove this result. We begin with some additional material 
relating to Banach algebras of continuous functions. In particular, we 
keep the promise made in Sec. 30 by showing that the Stone-Čech
compactification of a completely regular space is essentially unique.


74. IDEALS IN $\mathcal{C}(X)$ AND THE BANACH-STONE THEOREM


Let X be a compact Hausdorff space, and consider the commutative
B*-algebra $\mathcal{C}(X)$. If $\mathfrak{M}$ is the space of maximal ideals in $\mathcal{C}(X)$, then the
developments of the previous chapter lead us to expect that $\mathfrak{M}$ can be
identified with X and that the Gelfand mapping is the identity mapping
of $\mathcal{C}(X)$ onto itself.

In order to substantiate this conjecture, we begin by observing
that to each point x in X there corresponds a proper ideal $M_x$ in $\mathcal{C}(X)$,
defined by

\[ M_x = \{f : f \in \mathcal{C}(X) \text{ and } f(x) = 0\}. \]

$M_x$ is easily seen to be maximal and is thus an element of $\mathfrak{M}$, for it is the
null space of the multiplicative functional $f_{M_x}$, defined by $f_{M_x}(f) = f(x)$,
which assigns to each function in $\mathcal{C}(X)$ its value at x. Since X is compact
Hausdorff, and therefore normal, Urysohn's lemma tells us that for each
point $y \ne x$ there exists a function f in $\mathcal{C}(X)$ such that $f(x) = 0$ and
$f(y) \ne 0$. This shows that $x \to M_x$ is a one-to-one mapping of X into
$\mathfrak{M}$. Our next step is to prove that this mapping is onto, and for this it
clearly suffices to show that if M is any maximal ideal in $\mathcal{C}(X)$, then there
exists a point in X at which every function in M vanishes. We assume
the contrary, namely, that for each point x in X there exists a function
f in M such that $f(x) \ne 0$. Since f is continuous, x has a neighborhood
at no point of which f vanishes. We now vary x to obtain an open cover
for X, and we use compactness to infer that this open cover has a finite
subcover. Let $f_1, f_2, \ldots, f_n$ be the corresponding functions in M.
M is an ideal, so the function $g = \sum_{i=1}^n f_i \overline{f_i} = \sum_{i=1}^n |f_i|^2$ is also in M; and by
the manner of its construction, it clearly has the property that $g(x) > 0$
for every x. It follows that g is a regular element of $\mathcal{C}(X)$, and this
contradicts the fact that it lies in the proper ideal M. We therefore
conclude that $x \to M_x$ is a one-to-one mapping of X onto $\mathfrak{M}$.

These considerations enable us to identify the set $\mathfrak{M}$ with the set X,
and in terms of this identification, we regard X and $\mathfrak{M}$ as two possibly
different compact Hausdorff spaces built on the same underlying set of
points. By our work in the previous chapter, we know that the Gelfand
mapping $f \to \hat{f}$ is a one-to-one mapping of $\mathcal{C}(X)$ onto $\mathcal{C}(\mathfrak{M})$. If we use
the notation established there, then we find that

\[ \hat{f}(x) = \hat{f}(M_x) = f_{M_x}(f) = f(x), \]

so $\hat{f} = f$ and $\mathcal{C}(\mathfrak{M}) = \mathcal{C}(X)$. We now recall that any compact Hausdorff
topology on a non-empty set is uniquely determined as the weak topology
generated by the set of all its continuous complex functions (see Problem
27-3). It follows from this that X and $\mathfrak{M}$ are equal as topological spaces.
We summarize the results of this discussion in


Theorem A. Let X be a compact Hausdorff space and $\mathfrak{M}$ the space of
maximal ideals in the commutative B*-algebra $\mathcal{C}(X)$. Then to each point
x in X there corresponds a maximal ideal $M_x$ defined by

\[ M_x = \{f : f \in \mathcal{C}(X) \text{ and } f(x) = 0\}, \]

and $x \to M_x$ is a one-to-one mapping of X onto $\mathfrak{M}$. If this mapping is used
to identify $\mathfrak{M}$ with X, then $\mathfrak{M}$ and X are equal as topological spaces, $\mathcal{C}(\mathfrak{M})$
equals $\mathcal{C}(X)$, and the Gelfand mapping $f \to \hat{f}$ is the identity mapping of $\mathcal{C}(X)$
onto itself.


The main idea of this theorem is that the maximal ideals in C(X) 
correspond in a natural way to the points of X. Our next step is to 
extend this idea and to obtain a similar characterization of the proper 
closed ideals in C(X). 

We again consider a compact Hausdorff space X, and we begin our
discussion with the observation that to each non-empty closed subset F
of X there corresponds a proper closed ideal I(F) in $\mathcal{C}(X)$, defined by

\[ I(F) = \{f : f \in \mathcal{C}(X) \text{ and } f(F) = 0\}. \]

If x is any point not in F, then it follows from the complete regularity of
X that there exists a function f in $\mathcal{C}(X)$ such that $f(x) \ne 0$ and $f(F) = 0$.
This shows that $F \to I(F)$ is a one-to-one mapping of the class of all
non-empty closed subsets of X into the set of all proper closed ideals in $\mathcal{C}(X)$.
We shall prove that this mapping is onto, that is, that every proper closed
ideal in $\mathcal{C}(X)$ arises in this way from some F.

Let I be a proper closed ideal in $\mathcal{C}(X)$. We may assume that I is
not the zero ideal, for this ideal clearly arises from the full space X.
We define F by

\[ F = \{x : f(x) = 0 \text{ for every } f \in I\}. \]

It is easy to see that F is a proper closed subset of X; and since I is
contained in some maximal ideal, it follows that F is non-empty. Our
task is to prove that $I(F) = I$, and since it is obvious that $I \subseteq I(F)$,
the real problem is to prove that $I(F) \subseteq I$. If f is any function which
vanishes on F, we must show that f lies in I. We may evidently assume
that $f \ne 0$, so that $\{x : f(x) = 0\}$ is a proper subset of X.

In the first part of our proof, we assume that f vanishes on some open
set G which contains F. Since $f \ne 0$, the complement G' is non-empty and is thus a
compact subspace of X. For each point x in G', there exists a function
g in I such that $g(x) \ne 0$. The technique used in the proof of Theorem A
can now be applied again, to yield a finite number of functions
$g_1, g_2, \ldots, g_n$ in I with the property that at least one is non-zero at every
point of G'. We next define a function $g_0$ by $g_0 = \sum_{i=1}^n g_i \overline{g_i} = \sum_{i=1}^n |g_i|^2$,
and we observe that $g_0$ is in I and that $g_0(x) > 0$ for every x in G'. By
the Tietze extension theorem, the function whose values on G' are given
by $1/g_0(x)$ can be extended to a function h in $\mathcal{C}(X)$. It is easily seen that
$g_0 h$ is in I, that it equals 1 on G', and that $f = f g_0 h$, so f is in I.

We now turn to the general case. For each $\varepsilon > 0$, the sets K and L
defined by $K = \{x : |f(x)| \le \varepsilon/2\}$ and $L = \{x : |f(x)| \ge \varepsilon\}$ are disjoint
closed subsets of X. K is clearly non-empty, and since $f \ne 0$, L is also
non-empty for every sufficiently small $\varepsilon$. We assume that $\varepsilon$ has been
chosen at least this small, so that K and L constitute a disjoint pair of
closed subspaces of X. By Urysohn's lemma, there exists a function
g in $\mathcal{C}(X)$ such that $g(K) = 0$, $g(L) = 1$, and $0 \le g(x) \le 1$ for every x.
We now define a function h in $\mathcal{C}(X)$ by $h = fg$, and we note that

\[ \|f - h\| = \|f - fg\| \le \varepsilon. \]

It is evident that h vanishes on the set $G = \{x : |f(x)| < \varepsilon/3\}$; and since
G is an open set which contains F, it follows from the preceding
paragraph that h is in I. This shows that for every sufficiently small positive
number $\varepsilon$ there exists a function h in I such that $\|f - h\| \le \varepsilon$, and since
I is closed, we conclude that f is in I.

We give the following formal statement of our result. 


Theorem B. Let X be a compact Hausdorff space. Then to each
non-empty closed set F in X there corresponds a proper closed ideal I(F) in
$\mathcal{C}(X)$, defined by $I(F) = \{f : f \in \mathcal{C}(X) \text{ and } f(F) = 0\}$; and further, $F \to I(F)$
is a one-to-one mapping of the class of all non-empty closed subsets of X onto
the set of all proper closed ideals in $\mathcal{C}(X)$.


As an easy consequence of this, we have 


Theorem C. If X is a compact Hausdorff space, then every closed ideal 
in C(X) is the intersection of the maximal ideals which contain it. 

PROOF. Since the intersection of the empty set of maximal ideals is
$\mathcal{C}(X)$ itself, we may confine our attention to a proper closed ideal I.
By Theorem B, $I = I(F)$ for some non-empty closed set F. It is clear
that the maximal ideals which contain I are precisely those associated
with the points of F. It therefore suffices to observe that a function in
$\mathcal{C}(X)$ vanishes on F ⇔ it vanishes at each point of F.


We have seen in Theorem A that the points and the topology of a
compact Hausdorff space X can be recovered from the maximal ideals in
$\mathcal{C}(X)$. Since the maximal ideals in $\mathcal{C}(X)$ are objects of a purely
algebraic nature, it follows that the compact Hausdorff space X is fully
determined, both as a set and as a topological space, by the algebraic
structure of $\mathcal{C}(X)$. These observations lead us directly to


Theorem D (the Banach-Stone Theorem). Two compact Hausdorff spaces
X and Y are homeomorphic ⇔ their corresponding function algebras $\mathcal{C}(X)$
and $\mathcal{C}(Y)$ are isomorphic.


75. THE STONE-ČECH COMPACTIFICATION (continued)


It is natural to wonder what can be said along the lines of Theorem
74-A in the case of a topological space X which is not necessarily compact
Hausdorff. Regardless of the properties of X, we know from our
previous work that $\mathcal{C}(X)$ is a commutative B*-algebra, that its maximal
ideal space $\mathfrak{M}$ is a compact Hausdorff space, and that $x \to M_x$ is a
mapping of X into $\mathfrak{M}$. Our difficulty is that without restrictions of some
kind on X, we know practically nothing about the properties of the
mapping $x \to M_x$. If it happens that this mapping is one-to-one and is also
a homeomorphism of X onto a subspace of $\mathfrak{M}$, then we observe that X
is necessarily completely regular. It is therefore reasonable to assume at
the outset that X is completely regular, and we shall see that several
interesting conclusions follow from this hypothesis.


Theorem A. Let X be a completely regular space and $\mathfrak{M}$ the space of
maximal ideals in the commutative B*-algebra $\mathcal{C}(X)$. Then the mapping
$x \to M_x$ is a homeomorphism of X onto a subspace of $\mathfrak{M}$. Furthermore, if
this mapping is used to identify X with its image in $\mathfrak{M}$, then (1) X is a
dense subspace of $\mathfrak{M}$; (2) each function in $\mathcal{C}(X)$ has a unique extension to a
function in $\mathcal{C}(\mathfrak{M})$; and (3) if Y is a compact Hausdorff space with the
properties of $\mathfrak{M}$ stated in (1) and (2), then there exists a homeomorphism of
$\mathfrak{M}$ onto Y which leaves the points of X fixed.

PROOF. The fact that $x \to M_x$ is one-to-one is immediate from the
complete regularity of X, so we may identify X as a set with its image in
$\mathfrak{M}$. The subset X of $\mathfrak{M}$ has two topologies: its own, and its relative
topology as a subspace of $\mathfrak{M}$. The following arguments show that these
topologies are equal. We know that the Gelfand mapping $f \to \hat{f}$ is an
isomorphism of $\mathcal{C}(X)$ onto $\mathcal{C}(\mathfrak{M})$. Also, just as in the proof of Theorem
74-A, we have $\hat{f}(x) = f(x)$ for each f in $\mathcal{C}(X)$ and each x in X. These
observations imply that $\mathcal{C}(X)$ is precisely the set of all restrictions to X
of functions in $\mathcal{C}(\mathfrak{M})$; and since both topologies are completely regular,
it follows from Problems 19-1c and 27-4 that each is the weak topology
generated by $\mathcal{C}(X)$, so they are equal and X can be regarded as a subspace
of $\mathfrak{M}$. These observations also show that X is dense in $\mathfrak{M}$ (for if $\hat{f}$
vanishes on X, then $f = 0$, $\hat{f} = 0$, and $\hat{f}$ also vanishes on $\mathfrak{M}$) and that
each function f in $\mathcal{C}(X)$ has a unique extension $\hat{f}$ in $\mathcal{C}(\mathfrak{M})$. All that
remains is to prove (3). We know that $\hat{f} \to f$ is an isomorphism of
$\mathcal{C}(\mathfrak{M})$ onto $\mathcal{C}(X)$; and by the assumptions about Y, the mapping $f \to f'$,
which assigns to each f in $\mathcal{C}(X)$ its extension f' in $\mathcal{C}(Y)$, is an isomorphism
of $\mathcal{C}(X)$ onto $\mathcal{C}(Y)$. Thus $\hat{f} \to f \to f'$ is an isomorphism of $\mathcal{C}(\mathfrak{M})$ onto
$\mathcal{C}(Y)$. If x is a point of X, then this isomorphism clearly carries the
maximal ideal in $\mathcal{C}(\mathfrak{M})$ corresponding to x over to the maximal ideal in
$\mathcal{C}(Y)$ corresponding to x; so by the Banach-Stone theorem, it induces a
homeomorphism of $\mathfrak{M}$ onto Y which leaves the points of X fixed.


On comparing this result with Theorem 30-A, we see that $\mathfrak{M}$ is
homeomorphic, in the manner described, to the Stone-Čech
compactification $\beta(X)$. In this sense, therefore, $\mathfrak{M}$ and $\beta(X)$ can be considered
equal to one another, and also to any other compact Hausdorff space
which contains X as a dense subspace and has the required extension
property. In effect, we have shown that the Stone-Čech
compactification of a completely regular space X is unique and can equally well be
regarded as the maximal ideal space of $\mathcal{C}(X)$.


76. COMMUTATIVE C*-ALGEBRAS


In this final section, we apply the results of the preceding two
chapters to the theory of operators on a non-trivial Hilbert space H. We
know that $\mathcal{B}(H)$ and all its self-adjoint Banach subalgebras (that is, all
C*-algebras of operators on H) are B*-algebras. As a special case of the
Gelfand-Neumark theorem, we therefore have


Theorem A. Let A be a commutative C*-algebra of operators on H, and
$\mathfrak{M}$ its space of maximal ideals. Then the Gelfand mapping $T \to \hat{T}$ is an
isometric *-isomorphism of A onto $\mathcal{C}(\mathfrak{M})$.


If $\{T_i\}$ is a non-empty set of operators on H, then the smallest
Banach subalgebra of $\mathcal{B}(H)$ which contains every $T_i$ is called the Banach
subalgebra of $\mathcal{B}(H)$ generated by the $T_i$'s. It is easy to see that this
Banach subalgebra of $\mathcal{B}(H)$ is the closure of the set of all polynomials
in the $T_i$'s. If N is a normal operator on H, then the Banach
subalgebra of $\mathcal{B}(H)$ generated by N and $N^*$ is clearly a commutative
C*-algebra, and is called the commutative C*-algebra generated by N. We now
specialize Theorem A to


Theorem B. Let N be a normal operator on H, and A the commutative
C*-algebra generated by N. If $\mathfrak{M}$ is the space of maximal ideals in A, then
the Gelfand mapping $T \to \hat{T}$ is an isometric *-isomorphism of A onto $\mathcal{C}(\mathfrak{M})$.


As it stands, this result is only a beginning. In order to exploit it
effectively, our first task is to show that the spectrum of an operator in
A (which is understood to be its spectrum as an element of $\mathcal{B}(H)$) equals
its spectrum as an element of A. In proving this, we shall need the
following preliminary fact.


Lemma. Let X be a compact Hausdorff space and A a Banach subalgebra
of $\mathcal{C}(X)$. If f is a real function in A which is regular in $\mathcal{C}(X)$, then it is
also regular in A.
PROOF. The range of f is clearly a compact subspace of the real line
which does not contain 0. If $\varepsilon > 0$ is given, then by the Weierstrass
approximation theorem (see Problem 35-3) there exists a polynomial
p such that $|p(t) - 1/t| < \varepsilon$ for every t in f(X). It follows from this that
$|p(f(x)) - 1/f(x)| < \varepsilon$ for every x, so $\|p(f) - 1/f\| < \varepsilon$. Since p(f) is
in A and A is closed, we conclude that 1/f is in A.


Theorem C. Let A be a commutative C*-algebra of operators on H. If
an operator T in A is regular in $\mathcal{B}(H)$, then it is also regular in A, and
therefore the spectrum of T as an operator on H equals its spectrum as an
element of A.
PROOF. We begin by considering the special case in which T is
assumed to be self-adjoint. Let B be the Banach subalgebra of $\mathcal{B}(H)$
generated by T and $T^{-1}$. Since T and $T^{-1}$ are self-adjoint and commute
with one another, it is evident that B is a commutative C*-algebra; and if
$\mathfrak{M}$ is its space of maximal ideals, then B is isometrically *-isomorphic to
$\mathcal{C}(\mathfrak{M})$ and T is represented by a real function in $\mathcal{C}(\mathfrak{M})$. $C = A \cap B$ is
a Banach subalgebra of B and is therefore isomorphic to a Banach
subalgebra of $\mathcal{C}(\mathfrak{M})$. Since T is in C and is regular in B, our lemma
shows that $T^{-1}$ is also in C and therefore lies in A.

We now turn to the general case, in which T is not assumed to be
self-adjoint. It is clear that $U = TT^*$ is a self-adjoint operator in A, and
since it has an inverse $U^{-1} = (TT^*)^{-1} = (T^*)^{-1}T^{-1} = (T^{-1})^*T^{-1}$ in
$\mathcal{B}(H)$, we know from the preceding paragraph that $U^{-1}$ is in A. We now
make use of the commutativity of A to write the relation $UU^{-1} = I$ in
the form $T(T^*U^{-1}) = (T^*U^{-1})T = I$. This shows that $T^{-1} = T^*U^{-1}$,
so $T^{-1}$ lies in A and the proof is complete.


This result tells us, in particular, that if N is a normal operator on H,
then its spectrum $\sigma(N)$ equals its spectrum as an element of the
commutative C*-algebra generated by N. Our next step is to provide a concrete
representation for the space of maximal ideals in this algebra.


Theorem D. Let N be a normal operator on H, A the commutative
C*-algebra generated by N, and $\mathfrak{M}$ the space of maximal ideals in A. Then the
function $\hat{N}$ in $\mathcal{C}(\mathfrak{M})$ which corresponds to N under the Gelfand mapping
is a homeomorphism of $\mathfrak{M}$ onto $\sigma(N)$.

PROOF. It follows from Theorem C and part (4) of Theorem 70-B that
$\sigma(N)$ is precisely the range of the continuous function $\hat{N}$ defined on $\mathfrak{M}$.
Since both $\mathfrak{M}$ and $\sigma(N)$ are compact Hausdorff spaces, it suffices by
Theorem 26-E to show that $\hat{N}$ is one-to-one. Let $M_1$ and $M_2$ be points
of $\mathfrak{M}$ such that $\hat{N}(M_1) = \hat{N}(M_2)$. Then we also have

\[ \widehat{N^*}(M_1) = \overline{\hat{N}(M_1)} = \overline{\hat{N}(M_2)} = \widehat{N^*}(M_2), \]

so each of the functions $\hat{N}$ and $\widehat{N^*}$ takes equal values at $M_1$ and $M_2$.
Since A is the closure of the set of all polynomials in N and $N^*$, every
function in $\mathcal{C}(\mathfrak{M})$ is a uniform limit of polynomials in $\hat{N}$ and $\widehat{N^*}$, and
therefore every function in $\mathcal{C}(\mathfrak{M})$ takes equal values at $M_1$ and $M_2$. We
conclude the proof by observing that since $\mathcal{C}(\mathfrak{M})$ separates the points
of $\mathfrak{M}$, it follows that $M_1 = M_2$.


In accordance with this result, we may identify $\mathfrak{M}$ with the compact
subspace $\sigma(N)$ of the complex plane; and when this identification is
carried out, it is easy to see that $\hat{N}(z) = z$ for every z in $\mathfrak{M}$. We
summarize our conclusions in

Theorem E. Let N be a normal operator on H with spectrum $\sigma(N)$, and
let A be the commutative C*-algebra generated by N. Then the space $\mathfrak{M}$
of maximal ideals in A equals $\sigma(N)$, and the Gelfand mapping $T \to \hat{T}$ of
A onto $\mathcal{C}(\mathfrak{M})$ is an isometric *-isomorphism which carries N into the
function whose values are given by $\hat{N}(z) = z$ for every z in $\mathfrak{M}$.


This theorem has a number of simple consequences, of which the
following are only a few: (1) N = 0 ⇔ $\sigma(N) = \{0\}$; (2) N is singular
⇔ $\sigma(N)$ contains 0; (3) $\sigma(N^*) = \overline{\sigma(N)}$; (4) N is self-adjoint ⇔ $\sigma(N)$ lies on
the real line; (5) N is unitary ⇔ $\sigma(N) \subseteq \{z : |z| = 1\}$; (6) N is a projection
⇔ $\sigma(N) \subseteq \{0,1\}$. If it happens that H is finite-dimensional, so that
$\sigma(N)$ consists of a finite number of distinct complex numbers
$\lambda_1, \lambda_2, \ldots, \lambda_m$, then we can write

\[ \hat{N} = \sum_{i=1}^m \lambda_i \hat{P}_i, \]

where $\hat{P}_i$ is the function in $\mathcal{C}(\mathfrak{M})$ defined by $\hat{P}_i(\lambda_j) = \delta_{ij}$. It is evident
from this that

\[ N = \sum_{i=1}^m \lambda_i P_i, \]

where the $P_i$'s are non-zero pairwise orthogonal projections in A such that
$\sum_{i=1}^m P_i = I$. This is precisely the spectral resolution of N treated in
Chap. 11, so Theorem E actually contains the finite-dimensional spectral
theorem. We therefore have solid grounds for regarding Theorem E
as the generalized form of the spectral theorem discussed in the last
paragraph of Sec. 63, and all our promises are fulfilled.
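
The finite-dimensional case can be checked by hand on a small matrix. The following is a minimal sketch and not part of the original text: it assumes the eigenvalues are already known and distinct, and it builds each projection as a Lagrange interpolation polynomial in N, which is one concrete way to realize a function on the finite spectrum taking the value 1 at one eigenvalue and 0 at the others. The helper names (`matmul`, `identity`, `combine`, `spectral_projections`) are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def identity(n):
    """The n x n identity matrix with exact rational entries."""
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def combine(a, A, b, B):
    """Entrywise linear combination a*A + b*B."""
    n = len(A)
    return [[a * A[i][j] + b * B[i][j] for j in range(n)] for i in range(n)]

def spectral_projections(N, eigenvalues):
    """Return P_i = prod_{j != i} (N - l_j I) / (l_i - l_j), one per eigenvalue.

    Assumes N is normal and that `eigenvalues` lists its distinct
    eigenvalues; neither assumption is checked here.
    """
    lams = [Fraction(l) for l in eigenvalues]
    I = identity(len(N))
    projections = []
    for li in lams:
        P = I
        for lj in lams:
            if lj != li:
                # Multiply by the factor (N - lj*I) / (li - lj).
                P = matmul(P, combine(1 / (li - lj), N, -lj / (li - lj), I))
        projections.append(P)
    return projections

# A real symmetric (hence normal) matrix with eigenvalues 1 and 3.
N = [[2, 1], [1, 2]]
P1, P3 = spectral_projections(N, [1, 3])
```

For this matrix the two projections are pairwise orthogonal, sum to the identity, and recombine to N with the eigenvalues as coefficients, in agreement with the resolution described above.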
APPENDIX ONE


Fixed Point Theorems and Some Applications to Analysis


Let f be a continuous mapping of the closed interval [−1,1] into
itself. Figure 39 suggests that the graph of f must touch or cross the
indicated diagonal, or more precisely, that there must exist a point $x_0$ in
[−1,1] with the property that $f(x_0) = x_0$. The proof is easy. We
consider the continuous function F defined on [−1,1] by $F(x) = f(x) - x$,
and we observe that $F(-1) \ge 0$ and that $F(1) \le 0$. It now follows
from the Weierstrass intermediate value theorem (see Theorem 31-C
and the introduction to Chap. 6) that there exists a point $x_0$ in [−1,1]
such that $F(x_0) = 0$ or $f(x_0) = x_0$.
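
The argument just given can be turned into a numerical search. The sketch below is an illustration added here, not part of the original text: it applies bisection to F(x) = f(x) − x, keeping F ≥ 0 at the left endpoint and F ≤ 0 at the right one. The function name `fixed_point_by_bisection` and the choice of the cosine as test mapping are assumptions made for the example; the cosine does map [−1,1] continuously into itself.

```python
import math

def fixed_point_by_bisection(f, lo=-1.0, hi=1.0, tol=1e-12):
    """Approximate a point x0 in [lo, hi] with f(x0) = x0.

    Assumes f is continuous and maps [lo, hi] into itself, so that
    F(x) = f(x) - x satisfies F(lo) >= 0 and F(hi) <= 0.
    """
    def F(x):
        return f(x) - x

    if F(lo) == 0:
        return lo
    if F(hi) == 0:
        return hi
    # Invariant: F(lo) >= 0 and F(hi) <= 0, so a zero of F lies in [lo, hi].
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if F(mid) >= 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

x0 = fixed_point_by_bisection(math.cos)
```

Bisection only locates one fixed point; the intermediate value theorem, like the code, says nothing about uniqueness.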
It is convenient to describe this phenomenon by means of the
following terminology. A topological space X is called a fixed point space if
every continuous mapping f of X into itself has a fixed point, in the sense
that $f(x_0) = x_0$ for some $x_0$ in X. The remarks in the above paragraph
show that [−1,1] is a fixed point space. Furthermore, the closed disc
$\{(x,y) : x^2 + y^2 \le 1\}$ in the Euclidean plane $R^2$ is also a fixed point space
(for a lucid elementary proof of this, see Courant and Robbins [6, pp. 251-
255]). Both of these facts are special cases of


Brouwer's Fixed Point Theorem. The closed unit sphere $S = \{x : \|x\| \le 1\}$
in $R^n$ is a fixed point space.


There are several proofs of this classic result, but since they all
depend on the methods of algebraic topology, we refer the reader to Bers
[3, p. 86]. Brouwer's theorem itself is a special case of


Schauder’s Fixed Point Theorem. Every convex compact subspace of a 
Banach space is a fixed point space. 


For a proof, together with a discussion of other related results, see 
Bers [3, pp. 98-97]. Schauder’s theorem was foreshadowed by the work 
of Birkhoff and Kellogg [5] on existence theorems in analysis. We 
illustrate the relevance of these ideas to such problems by giving a full 
treatment of Picard’s theorem on the existence and uniqueness of solu- 
tions of first order differential equations. 

We begin by considering an arbitrary metric space X with metric d.
A mapping T of X into itself is called a contraction if there exists a positive
real number r < 1 with the property that $d(Tx,Ty) \le r\,d(x,y)$ for all
points x and y in X. It is obvious that such a mapping is continuous.
We shall need the following


Lemma. If T is a contraction defined on a complete metric space X, then
T has a unique fixed point.
PROOF. Let $x_0$ be an arbitrary point in X, and write $x_1 = Tx_0$,
$x_2 = T^2x_0 = Tx_1$, and, in general, $x_n = T^nx_0 = Tx_{n-1}$. If m < n, then

\[
\begin{aligned}
d(x_m,x_n) &= d(T^mx_0,T^nx_0) = d(T^mx_0,T^mT^{n-m}x_0) \\
&\le r^m d(x_0,T^{n-m}x_0) = r^m d(x_0,x_{n-m}) \\
&\le r^m [d(x_0,x_1) + d(x_1,x_2) + \cdots + d(x_{n-m-1},x_{n-m})] \\
&\le r^m d(x_0,x_1)[1 + r + \cdots + r^{n-m-1}] \\
&\le \frac{r^m d(x_0,x_1)}{1 - r}.
\end{aligned}
\]

Since r < 1, it is evident from this that $\{x_n\}$ is a Cauchy sequence, and
by the completeness of X, there exists a point x in X such that $x_n \to x$.
We now use the continuity of T to infer that x is a fixed point:

\[ Tx = T(\lim x_n) = \lim Tx_n = \lim x_{n+1} = x. \]

We conclude the proof by showing that x is the only fixed point. If y
is also a fixed point, that is, if Ty = y, then $d(x,y) = d(Tx,Ty) \le r\,d(x,y)$;
and since r < 1, this implies that d(x,y) = 0 or y = x.
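
The proof is constructive, and the estimate it contains gives a stopping rule: letting n tend to infinity shows that the m-th iterate lies within $r^m d(x_0,x_1)/(1 - r)$ of the fixed point. The following sketch is an added illustration under stated assumptions: the space is a closed interval of the real line with the usual metric, the contraction constant r is supplied by the caller, and the name `iterate_contraction` is hypothetical.

```python
import math

def iterate_contraction(T, x0, r, tol=1e-12, max_iter=10_000):
    """Iterate x_n = T(x_{n-1}) until r**n * d(x0, x1) / (1 - r) <= tol.

    Assumes T is a contraction with constant r < 1 on a complete subset
    of the real line, with d(x, y) = |x - y|. Returns (x_n, n).
    """
    x1 = T(x0)
    d01 = abs(x1 - x0)
    x, n = x1, 1  # x holds the current iterate x_n
    while r ** n * d01 / (1 - r) > tol and n < max_iter:
        x = T(x)
        n += 1
    return x, n

# On [0, 1] the cosine maps the interval into itself and satisfies
# |cos x - cos y| <= sin(1) |x - y|, so r = sin(1) < 1 is admissible.
x, steps = iterate_contraction(math.cos, 1.0, math.sin(1.0))
```

The a priori bound is pessimistic, so the loop usually runs longer than strictly necessary; its merit is that the error is guaranteed before the limit is known.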


This result is the key to 


Picard's Theorem. If f(x,y) and $\partial f/\partial y$ are continuous in a closed
rectangle $R = \{(x,y) : a_1 \le x \le a_2 \text{ and } b_1 \le y \le b_2\}$, and if $(x_0,y_0)$ is an
interior point of R, then the differential equation

\[ \frac{dy}{dx} = f(x,y) \tag{1} \]

has a unique solution y = g(x) which passes through $(x_0,y_0)$.
PROOF. Since f(x,y) and $\partial f/\partial y$ are continuous in R, they are bounded,
and consequently there exist constants K and M such that

\[ |f(x,y)| \le K \tag{2} \]

and

\[ \left| \frac{\partial}{\partial y} f(x,y) \right| \le M \tag{3} \]

for all points (x,y) in R. We next observe that if $(x,y_1)$ and $(x,y_2)$ are
in R, then the mean value theorem guarantees that

\[ |f(x,y_1) - f(x,y_2)| = |y_1 - y_2| \left| \frac{\partial}{\partial y} f(x, y_1 + \theta(y_2 - y_1)) \right| \tag{4} \]

for some $\theta$ such that $0 < \theta < 1$. It now follows from (3) and (4) that

\[ |f(x,y_1) - f(x,y_2)| \le M|y_1 - y_2| \tag{5} \]

for all $(x,y_1)$ and $(x,y_2)$ in R.¹

It is convenient at this stage to replace our problem by an equivalent
problem relating to an integral equation. If y = g(x) satisfies (1) and
has the property that $g(x_0) = y_0$, then integrating (1) from $x_0$ to x yields

\[ g(x) - g(x_0) = \int_{x_0}^{x} f(t,g(t))\,dt \]

or

\[ g(x) = y_0 + \int_{x_0}^{x} f(t,g(t))\,dt. \tag{6} \]

Conversely, if y = g(x) satisfies (6), then it is clear that $g(x_0) = y_0$, and
on differentiating (6) we obtain (1). It therefore suffices to show that
the integral equation (6) has a unique solution.


1 The only use we make of the hypothesis that $\partial f/\partial y$ exists and is continuous in
R is to derive the so-called Lipschitz condition (5).

To accomplish this, we choose a positive number a such that Ma < 1
and the closed rectangle R' determined by $|x - x_0| \le a$ and $|y - y_0| \le Ka$
is contained in R. We now let X be the set of all continuous real
functions y = g(x) defined on the closed interval $|x - x_0| \le a$ such that
$|g(x) - y_0| \le Ka$. X is clearly a closed subspace of the complete metric
space $\mathcal{C}[x_0 - a, x_0 + a]$ and is therefore itself a complete metric space.
Our next step is to consider the mapping T of X into itself defined by
Tg = h, where

\[ h(x) = y_0 + \int_{x_0}^{x} f(t,g(t))\,dt. \]

The fact that T maps X into itself is evident from (2), for

\[ |h(x) - y_0| = \left| \int_{x_0}^{x} f(t,g(t))\,dt \right| \le Ka. \]

Furthermore, it follows from (5) that

\[
\begin{aligned}
|h_1(x) - h_2(x)| &= \left| \int_{x_0}^{x} [f(t,g_1(t)) - f(t,g_2(t))]\,dt \right| \\
&\le Ma \sup |g_1(x) - g_2(x)|;
\end{aligned}
\]

and since Ma < 1, this shows that T is a contraction on X. We now
appeal to our lemma to conclude that the equation Tg = g has a unique
solution. Since this amounts to saying that the integral equation (6)
has a unique solution, our proof is complete.
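
To see the mapping T at work, it may help to iterate it on a case where every step can be computed exactly. The sketch below is an added illustration, not taken from the text: it assumes the particular equation dy/dx = y with $x_0 = 0$ and $y_0 = 1$, so that f(x,y) = y and (Tg)(x) = 1 + ∫₀ˣ g(t) dt, and it represents each iterate as a polynomial with rational coefficients. The names `picard_step`, `picard_iterates`, and `evaluate` are hypothetical.

```python
import math
from fractions import Fraction

def picard_step(coeffs, y0=Fraction(1)):
    """Apply (Tg)(x) = y0 + integral from 0 to x of g(t) dt, for f(x, y) = y.

    coeffs[k] is the coefficient of x**k in the polynomial g.
    """
    integrated = [c / (k + 1) for k, c in enumerate(coeffs)]  # term-by-term integral
    return [y0] + integrated

def picard_iterates(n):
    """Return the coefficients of T^n g_0, starting from the constant g_0 = 1."""
    g = [Fraction(1)]
    for _ in range(n):
        g = picard_step(g)
    return g

def evaluate(coeffs, x):
    """Evaluate the polynomial with the given coefficients at x."""
    return sum(float(c) * x ** k for k, c in enumerate(coeffs))

g3 = picard_iterates(3)        # coefficients 1, 1, 1/2, 1/6
approx = evaluate(picard_iterates(15), 0.5)
```

In this special case the iterates are the partial sums of the exponential series, which is consistent with the unique solution through (0,1) being the exponential function. For a general f the integral would have to be approximated numerically.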


The ideas in this proof have a much wider scope than might be 
suspected, and can be applied to establish many other existence theorems 
in the theory of differential and integral equations. 


APPENDIX TWO


Continuous Curves and the Hahn-Mazurkiewicz Theorem


A continuous curve is usually thought of as “the path of a
continuously moving point,” and this rather vague notion is often felt to carry
with it the even vaguer attribute of “thinness,” or “one-dimensionality.”
For the case of plane curves, Jordan (in 1887) gave precise expression
to this intuitive geometric concept by means of the following definition:
if f is a continuous mapping of the closed unit interval I = [0,1] into the
Euclidean plane $R^2$, then the subset f(I) of $R^2$ is called a continuous curve.
The fame of Jordan's definition rests mainly on Peano's discovery (in
1890) of a continuous curve which passes through every point of a closed
square. Curves of this type have come to be called space-filling curves.

[Fig. 40. The curves $f_1(I)$, $f_2(I)$, $f_3(I)$.]

In Fig. 40, we show the first three stages in the construction of a
particularly simple example known as Hilbert's space-filling curve. If
the square under consideration is $S = \{(x,y) : 0 \le x \le 1 \text{ and } 0 \le y \le 1\}$,
then the respective curves are the images of I under continuous mappings
$f_1$, $f_2$, and $f_3$ of I into S. The process of constructing these curves can
be continued in the same way, and it yields a sequence of continuous
mappings $f_n$ of I into S. By the manner in which each curve is
constructed from its predecessor, it is clear that the sequence $\{f_n\}$ converges
pointwise to a mapping f of I into S; and since this convergence is
evidently uniform, f is continuous (see Problem 14-4) and f(I) is a continuous
curve in the sense of Jordan. Furthermore, each point of S lies in f(I),
so f(I) is a space-filling curve.
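
The stages of this construction can be generated mechanically. The sketch below is an added illustration and makes no claim to reproduce the exact orientation of the curves in Fig. 40: it uses a widely known index-to-cell conversion for the Hilbert curve, in which the square is divided into $4^n$ congruent subsquares and the n-th stage visits their centres in order. The names `rotate`, `hilbert_cell`, and `hilbert_polygon` are hypothetical.

```python
def rotate(s, x, y, rx, ry):
    """Rotate or reflect a cell inside a block of side s so that pieces join up."""
    if ry == 0:
        if rx == 1:
            x, y = s - 1 - x, s - 1 - y
        x, y = y, x
    return x, y

def hilbert_cell(order, d):
    """Cell (x, y) of the 2**order by 2**order grid visited at step d.

    The step index d runs over 0 <= d < 4**order.
    """
    x = y = 0
    t = d
    s = 1
    while s < 2 ** order:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        x, y = rotate(s, x, y, rx, ry)
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y

def hilbert_polygon(order):
    """Centres of the visited cells, scaled into the unit square S."""
    side = 2 ** order
    return [((x + 0.5) / side, (y + 0.5) / side)
            for x, y in (hilbert_cell(order, d) for d in range(4 ** order))]

stage3 = hilbert_polygon(3)  # 64 points; joining them in order gives a polygonal curve
```

Each stage visits every subsquare exactly once, and consecutive subsquares share an edge. This is the combinatorial fact behind the statement that the limit mapping reaches every point of S.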

Peano's discovery of space-filling curves was a shock to many
mathematicians of the time, for it violated all their preconceived ideas of what
a continuous curve ought to be. To a few of the others, however, it
presented an opportunity. It suggested the very interesting problem
of determining what a continuous curve actually is, or in other words, of
finding intrinsic topological properties of a subset X of $R^2$ which are
equivalent to the existence of a continuous mapping of I onto X.

Before describing the solution of this problem, we place it in a wider
context by extending Jordan's definition. A topological space X is
called a continuous curve if X is a Hausdorff space and there exists a
continuous mapping of I onto X.¹ We know that I is compact and
connected, so by Theorems 21-B and 31-B, any continuous curve is also
compact and connected. In the lemmas below, we give two additional
properties which every continuous curve must have.

It is convenient to begin by introducing the following concept. A 
mapping f of one topological space into another is said to be closed if it 
carries closed sets into closed sets, that is, if f(F) is closed whenever F is 
closed. We shall use the fact that a continuous mapping of a compact 
space into a Hausdorff space is automatically closed (see the proof of 
Theorem 26-B). 


Lemma. Every continuous curve is second countable.
PROOF. Let f be a continuous mapping of I onto a Hausdorff space X.
We must show that X is second countable, that is, that it has a countable
open base. I is a separable metric space, so it has a countable open
base $\{B_i\}$, and it is easily seen that the class $\{G_i\}$ of all finite unions of the
$B_i$'s is also a countable open base for I. Since I is compact and X is
Hausdorff, f is closed, and therefore each set $f(G_i')$ is closed. The class of
all sets of the form $f(G_i')'$ is thus a countable class of open subsets of X,
so it suffices to show that these sets constitute an open base for X.

Let x be a point of X with neighborhood G. The set $f^{-1}(\{x\})$ is
closed and is therefore a compact subspace of I with neighborhood $f^{-1}(G)$.
For each point y in $f^{-1}(\{x\})$, there exists a set in $\{G_i\}$ which contains y and
is contained in $f^{-1}(G)$. By the compactness of $f^{-1}(\{x\})$ and the fact that
$\{G_i\}$ is closed under the formation of finite unions, there exists a $G_i$ such
that

\[ f^{-1}(\{x\}) \subseteq G_i \subseteq f^{-1}(G). \]

On taking complements, we obtain

\[ f^{-1}(\{x\})' \supseteq G_i' \supseteq f^{-1}(G)', \tag{1} \]

and since the complement of an inverse image equals the inverse image of
the complement, we can write (1) in the form

\[ f^{-1}(\{x\}') \supseteq G_i' \supseteq f^{-1}(G'). \tag{2} \]

If we now apply f to all members of (2), we get

\[ ff^{-1}(\{x\}') \supseteq f(G_i') \supseteq ff^{-1}(G') \]

or

\[ \{x\}' \supseteq f(G_i') \supseteq G', \]

so

\[ \{x\} \subseteq f(G_i')' \subseteq G, \]

and the proof is complete.


1 A continuous curve in the sense of this definition is often called a Peano space.


Lemma. Every continuous curve is locally connected.

PROOF. Let f be a continuous mapping of I onto a Hausdorff space
X. We must show that X is locally connected. By Problem 34-1, it
suffices to show that if C is a component of an open subspace G of X, then
C is open.

Let A be a component of $f^{-1}(G)$. Then A is connected, and therefore
f(A) is connected in G; and since C is a component of G, we see that f(A) is
either disjoint from C or contained in C. It follows from this that $f^{-1}(C)$
is a union of components of $f^{-1}(G)$. Since $f^{-1}(G)$ is open and I is locally
connected, Theorem 34-A tells us that the components of $f^{-1}(G)$ are open,
so $f^{-1}(C)$ is open and $f^{-1}(C)' = f^{-1}(C')$ is closed. We conclude the proof
by observing that since the mapping f is closed, the set $ff^{-1}(C') = C'$ is
closed, so C is open.


The above remarks and lemmas establish the easy half of the follow- 
ing famous characterization of continuous curves. 


The Hahn-Mazurkiewicz Theorem. A topological space X is a continuous 
curve $\Leftrightarrow$ X is a compact Hausdorff space which is second countable, connected, 
and locally connected. 


For the remainder of the proof, we refer the reader to Wilder [43, 
p. 76]. Additional discussions of a descriptive and historical nature can 
be found in Wilder [44] and Hahn [15]. 


APPENDIX THREE 


Boolean Algebras, Boolean Rings, and Stone’s Theorem 


We saw in Sec. 2 that a Boolean algebra of sets can be defined as a 
class of subsets of a non-empty set which is closed under the formation of 
finite unions, finite intersections, and complements. Our purpose in 
this appendix is threefold: to define abstract Boolean algebras by means 
of lattices; to show that the theory of these systems can be regarded as 
part of the general theory of rings; and to prove the famous theorem of 
Stone, which asserts that every Boolean algebra is isomorphic to a Boolean 
algebra of sets. 

The reader will recall that a lattice is a partially ordered set in which 
each pair of elements x and y has a greatest lower bound $x \wedge y$ and a least 
upper bound $x \vee y$, and that these elements are uniquely determined by 
x and y. It is easy to show (see Problem 8-5) that the operations $\wedge$ and 
$\vee$ have the following properties: 

$$x \wedge x = x \quad\text{and}\quad x \vee x = x; \tag{1}$$

$$x \wedge y = y \wedge x \quad\text{and}\quad x \vee y = y \vee x; \tag{2}$$

$$x \wedge (y \wedge z) = (x \wedge y) \wedge z \quad\text{and}\quad x \vee (y \vee z) = (x \vee y) \vee z; \tag{3}$$

$$(x \wedge y) \vee x = x \quad\text{and}\quad (x \vee y) \wedge x = x. \tag{4}$$

We shall see in the next paragraph that these properties are actually 
characteristic of lattices. Before proceeding further, however, we remark 
that 

$$x \leq y \Leftrightarrow x \wedge y = x.$$


This fact serves to motivate the following discussion. 

Let L be a non-empty set in which two operations $\wedge$ and $\vee$ are defined, 
and assume that these operations satisfy the above conditions. We 
shall prove that a partial order relation $\leq$ can be defined in L in such a 
way that L becomes a lattice in which $x \wedge y$ and $x \vee y$ are the greatest 
lower bound and least upper bound of x and y. Our first step is to notice 
that $x \wedge y = x$ and $x \vee y = y$ are equivalent; for if $x \wedge y = x$, then 
$x \vee y = (x \wedge y) \vee y = (y \wedge x) \vee y = y$, and similarly $x \vee y = y$ implies 
$x \wedge y = x$. We now define $x \leq y$ to mean that either $x \wedge y = x$ or 
$x \vee y = y$. Since $x \wedge x = x$, we have $x \leq x$ for every x. If $x \leq y$ and 
$y \leq x$, so that $x \wedge y = x$ and $y \wedge x = y$, then $x = x \wedge y = y \wedge x = y$. 
If $x \leq y$ and $y \leq z$, so that $x \wedge y = x$ and $y \wedge z = y$, then 

$$x \wedge z = (x \wedge y) \wedge z = x \wedge (y \wedge z) = x \wedge y = x,$$

so $x \leq z$. This completes the proof that $\leq$ is a partial order relation. 
We now show that $x \wedge y$ is the greatest lower bound of x and y. Since 
$(x \wedge y) \vee x = x$ and $(x \wedge y) \vee y = (y \wedge x) \vee y = y$, we see that $x \wedge y \leq x$ 
and $x \wedge y \leq y$. If $z \leq x$ and $z \leq y$, so that $z \wedge x = z$ and $z \wedge y = z$, then 
$z \wedge (x \wedge y) = (z \wedge x) \wedge y = z \wedge y = z$, so $z \leq x \wedge y$. It is easy to prove, 
by similar arguments, that $x \vee y$ is the least upper bound of x and y. 

This characterization of lattices brings the theory of these systems 
somewhat closer to ordinary abstract algebra, in which operations 
(instead of relations) are usually placed in the foreground. 
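
The passage from operations to an order relation can be checked mechanically on a small example. The following sketch is only an illustration and is not part of the original text: it assumes, as a hypothetical test case, the divisors of 12 with the greatest common divisor as meet and the least common multiple as join, and all function names are our own. It verifies conditions (1) to (4), confirms that the relation "meet(x, y) equals x" is ordinary divisibility, and confirms that the meet is the greatest lower bound.

```python
from itertools import product
from math import gcd

# Hypothetical example: divisors of 12, meet = gcd, join = lcm.
L = [d for d in range(1, 13) if 12 % d == 0]

def meet(x, y):
    return gcd(x, y)

def join(x, y):
    return x * y // gcd(x, y)

def leq(x, y):
    # The order defined in the text: x <= y means meet(x, y) = x.
    return meet(x, y) == x

def satisfies_conditions_1_to_4():
    for x, y, z in product(L, repeat=3):
        if meet(x, x) != x or join(x, x) != x:                     # (1)
            return False
        if meet(x, y) != meet(y, x) or join(x, y) != join(y, x):   # (2)
            return False
        if meet(x, meet(y, z)) != meet(meet(x, y), z):             # (3)
            return False
        if join(x, join(y, z)) != join(join(x, y), z):             # (3)
            return False
        if join(meet(x, y), x) != x or meet(join(x, y), x) != x:   # (4)
            return False
    return True

def order_is_divisibility():
    return all(leq(x, y) == (y % x == 0) for x, y in product(L, repeat=2))

def meet_is_greatest_lower_bound():
    for x, y in product(L, repeat=2):
        m = meet(x, y)
        if not (leq(m, x) and leq(m, y)):
            return False
        # Every lower bound z of x and y must lie below m.
        if any(leq(z, x) and leq(z, y) and not leq(z, m) for z in L):
            return False
    return True
```

A finite check of this kind proves nothing in general, but it makes the roles of the four conditions concrete.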

A lattice is said to be distributive if it has the following properties: 

$$x \wedge (y \vee z) = (x \wedge y) \vee (x \wedge z) \tag{5}$$

and 

$$x \vee (y \wedge z) = (x \vee y) \wedge (x \vee z). \tag{6}$$

It is useful to know that (5) and (6) are equivalent to one another. For 
if (5) holds, then 

$$\begin{aligned}
(x \vee y) \wedge (x \vee z) &= [(x \vee y) \wedge x] \vee [(x \vee y) \wedge z] \\
&= x \vee [(x \vee y) \wedge z] \\
&= x \vee [(x \wedge z) \vee (y \wedge z)] \\
&= [x \vee (x \wedge z)] \vee (y \wedge z) \\
&= x \vee (y \wedge z),
\end{aligned}$$

and a similar computation shows that (6) implies (5). We shall say that a 
lattice is complemented if it contains distinct elements 0 and 1 such that 

$$0 \leq x \leq 1 \tag{7}$$

for every x (these elements are clearly unique when they exist), and if 
each element x has a complement $x'$ with the property that 

$$x \wedge x' = 0 \quad\text{and}\quad x \vee x' = 1. \tag{8}$$


We now define a Boolean algebra to be a complemented distributive lattice. 
It is quite possible for an element of a complemented lattice to have 
many different complements. In a Boolean algebra, however, each 
element has only one complement. To prove this, we suppose that $x^*$ is 
also an element with the property that $x \wedge x^* = 0$ and $x \vee x^* = 1$. 
Then 

$$\begin{aligned}
x^* = x^* \wedge 1 = x^* \wedge (x \vee x') &= (x^* \wedge x) \vee (x^* \wedge x') \\
&= 0 \vee (x^* \wedge x') = x^* \wedge x',
\end{aligned}$$

so $x^* \leq x'$. If we now reverse the roles of $x'$ and $x^*$, we obtain $x' \leq x^*$, 
so $x^* = x'$. In the light of this result, it is evident from (8) that x is the 
complement of $x'$: 

$$x'' = x. \tag{9}$$

Furthermore, it follows from (7) that $0 \wedge 1 = 0$ and $0 \vee 1 = 1$, so we have 

$$0' = 1 \quad\text{and}\quad 1' = 0. \tag{10}$$

The identities 

$$(x \wedge y)' = x' \vee y' \quad\text{and}\quad (x \vee y)' = x' \wedge y' \tag{11}$$

are also true in every Boolean algebra. We shall prove the first part of 
(11). Our principal tool will be the fact that 

$$x \leq y \Leftrightarrow y' \leq x'. \tag{12}$$

To establish (12), it suffices to show that $x \leq y$ implies $y' \leq x'$, and the 
proof of this is easy: if $x \leq y$, then $x \wedge y' \leq y \wedge y' = 0$, so 

$$y' = y' \wedge 1 = y' \wedge (x \vee x') = (y' \wedge x) \vee (y' \wedge x') = 0 \vee (y' \wedge x') = y' \wedge x',$$

and therefore $y' \leq x'$. We now turn to the proof of $(x \wedge y)' = x' \vee y'$. 
Our first step is to observe that if $x' \leq z$ and $y' \leq z$, so that $z' \leq x$ and 
$z' \leq y$, then $z' \leq x \wedge y$ or $(x \wedge y)' \leq z$. This shows that $(x \wedge y)'$ is less 
than or equal to any upper bound of $x'$ and $y'$, so $(x \wedge y)' \leq x' \vee y'$. We 
conclude the proof by showing that $x' \vee y' \leq (x \wedge y)'$. This follows at 
once from the relations $x' \leq (x \wedge y)'$ and $y' \leq (x \wedge y)'$, which, since they 
are equivalent to $x \wedge y \leq x$ and $x \wedge y \leq y$, are evidently true. The 
second part of (11) can be proved in essentially the same way. 
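
Identities such as (9), (11), and (12) are easy to test exhaustively in a small Boolean algebra of sets. The sketch below is an illustration under our own assumptions, not part of the original text: the subsets of a three-element set are encoded as the integers 0 to 7 read as bitmasks, and the helper names are hypothetical. It checks that each element has exactly one complement in the sense of (8), and that the De Morgan identities (11) and the order reversal (12) hold.

```python
from itertools import product

N = 3                    # size of the underlying set (an arbitrary choice)
FULL = (1 << N) - 1      # the element 1; the element 0 is the empty set
A = range(FULL + 1)      # subsets of {0, 1, 2} encoded as bitmasks

def meet(x, y):
    return x & y

def join(x, y):
    return x | y

def comp(x):
    return FULL & ~x

def leq(x, y):
    return meet(x, y) == x

def complements_of(x):
    # All c with meet(x, c) = 0 and join(x, c) = 1, as in (8).
    return [c for c in A if meet(x, c) == 0 and join(x, c) == FULL]

def complements_are_unique():
    return all(complements_of(x) == [comp(x)] for x in A)

def double_complement_holds():
    # Identity (9).
    return all(comp(comp(x)) == x for x in A)

def de_morgan_holds():
    # Both parts of (11).
    return all(
        comp(meet(x, y)) == join(comp(x), comp(y))
        and comp(join(x, y)) == meet(comp(x), comp(y))
        for x, y in product(A, repeat=2)
    )

def order_reversal_holds():
    # Statement (12): x <= y exactly when y' <= x'.
    return all(leq(x, y) == leq(comp(y), comp(x))
               for x, y in product(A, repeat=2))
```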

One of the basic facts about Boolean algebras is that these systems 
can be identified with a certain class of rings. This enables us to study 
Boolean algebras by means of powerful techniques which are already 
available in the general theory of rings. 

A Boolean ring is a ring with identity in which every element is 
idempotent (i.e., $x^2 = x$ for every x). It is a surprising fact that 
multiplication in a Boolean ring is automatically commutative and that 
$x + x = 0$ (or equivalently, $x = -x$) for every x. The proof of these 
statements rests on the relation 

$$\begin{aligned}
x + y = (x + y)^2 = (x + y)(x + y) &= x^2 + xy + yx + y^2 \\
&= x + xy + yx + y,
\end{aligned}$$

which implies that $xy + yx = 0$, so $xy = -yx$. If we put $y = x$, this 
yields $x^2 = -x^2$, so $x = -x$; and from this we obtain $xy = -yx = yx$. 

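In order to make a Boolean algebra A into a Boolean ring R, we 
define addition and multiplication by 

$$x + y = (x \wedge y') \vee (x' \wedge y) \quad\text{and}\quad xy = x \wedge y. \tag{13}$$

(For the motivation behind these definitions, see Example 40-3.) To 
verify that R actually is a Boolean ring, we proceed as follows. It is clear 
that 

$$\begin{aligned}
x + y = (x \wedge y') \vee (x' \wedge y) &= (y' \wedge x) \vee (y \wedge x') \\
&= (y \wedge x') \vee (y' \wedge x) \\
&= y + x,
\end{aligned}$$

that 

$$\begin{aligned}
x + 0 = (x \wedge 0') \vee (x' \wedge 0) &= (x \wedge 1) \vee 0 \\
&= x \wedge 1 = x,
\end{aligned}$$

and that 

$$x + x = (x \wedge x') \vee (x' \wedge x) = 0 \vee 0 = 0.$$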


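The proof that addition is associative is more complicated. It is 
convenient to begin with the observation that 

$$\begin{aligned}
(x + y)' &= (x \wedge y')' \wedge (x' \wedge y)' = (x' \vee y) \wedge (x \vee y') \\
&= [(x' \vee y) \wedge x] \vee [(x' \vee y) \wedge y'] \\
&= [(x' \wedge x) \vee (y \wedge x)] \vee [(x' \wedge y') \vee (y \wedge y')] \\
&= (x \wedge y) \vee (x' \wedge y').
\end{aligned}$$

Now, using this, we have 

$$\begin{aligned}
x + (y + z) &= [x \wedge (y + z)'] \vee [x' \wedge (y + z)] \\
&= [x \wedge ((y \wedge z) \vee (y' \wedge z'))] \vee [x' \wedge ((y \wedge z') \vee (y' \wedge z))] \\
&= (x \wedge y \wedge z) \vee (x \wedge y' \wedge z') \vee (x' \wedge y \wedge z') \vee (x' \wedge y' \wedge z).
\end{aligned}$$

It is clear by inspection that the expression last written is unaltered by 
interchanging x and z, so $x + (y + z) = z + (y + x)$; and since, by 
commutativity, we have $z + (y + x) = (x + y) + z$, it follows that 
addition is associative. The relevant properties of multiplication are 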
fairly easy to establish. It is immediate from the definition that 


x(yz) = (xy)z, 


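that $x^2 = x$, and that 1 is an identity. In view of the fact that 
multiplication is obviously commutative, all that remains is to verify that 
$x(y + z) = xy + xz$, and this is a consequence of the following 
computations: 

$$\begin{aligned}
x(y + z) &= x \wedge (y + z) \\
&= x \wedge [(y \wedge z') \vee (y' \wedge z)] \\
&= (x \wedge y \wedge z') \vee (x \wedge y' \wedge z),
\end{aligned}$$

and 

$$\begin{aligned}
xy + xz &= (x \wedge y) + (x \wedge z) \\
&= [(x \wedge y) \wedge (x \wedge z)'] \vee [(x \wedge y)' \wedge (x \wedge z)] \\
&= [(x \wedge y) \wedge (x' \vee z')] \vee [(x' \vee y') \wedge (x \wedge z)] \\
&= (x \wedge y \wedge x') \vee (x \wedge y \wedge z') \vee (x' \wedge x \wedge z) \vee (y' \wedge x \wedge z) \\
&= (x \wedge y \wedge z') \vee (x \wedge y' \wedge z).
\end{aligned}$$

Thus R is a Boolean ring. 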
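
The definitions (13) can be tried out on a Boolean algebra that is not presented as a class of sets. The following sketch is an illustration only, under assumptions of our own: it takes the divisors of 30 (a square-free integer, so that the divisors form a Boolean algebra under divisibility), with the greatest common divisor as meet, the least common multiple as join, and 30/x as the complement of x. The lattice element 0 is then the integer 1 and the lattice element 1 is the integer 30. All names are hypothetical. The code defines addition and multiplication by (13) and checks the Boolean ring axioms exhaustively.

```python
from itertools import product
from math import gcd

ONE = 30    # greatest element; 30 is square-free (illustrative choice)
ZERO = 1    # least element under divisibility
A = [d for d in range(1, ONE + 1) if ONE % d == 0]

def meet(x, y):
    return gcd(x, y)

def join(x, y):
    return x * y // gcd(x, y)

def comp(x):
    return ONE // x

def add(x, y):
    # (13): x + y = (x meet y') join (x' meet y)
    return join(meet(x, comp(y)), meet(comp(x), y))

def mul(x, y):
    # (13): xy = x meet y
    return meet(x, y)

def is_boolean_ring():
    for x, y, z in product(A, repeat=3):
        if add(x, y) != add(y, x):
            return False
        if add(x, add(y, z)) != add(add(x, y), z):
            return False
        if mul(x, mul(y, z)) != mul(mul(x, y), z):
            return False
        if mul(x, add(y, z)) != add(mul(x, y), mul(x, z)):
            return False
    return all(
        add(x, ZERO) == x and add(x, x) == ZERO
        and mul(x, x) == x and mul(x, ONE) == x
        for x in A
    )
```

In this model the sum of 2 and 3 is 6, while the sum of 6 and 3 is 2: addition acts as a symmetric difference on the sets of prime factors.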
We now reverse this process; that is, we start with a Boolean ring R, 
and we show that the definitions 

$$x \wedge y = xy \quad\text{and}\quad x \vee y = x + y + xy \tag{14}$$

convert it into a Boolean algebra A. If we keep in mind the fact that 
multiplication in R is commutative and that for every x we have $x^2 = x$ 
and $x = -x$, then (1) and (2) are evident. Property (3) follows from 
the associativity of multiplication and the computations 

$$\begin{aligned}
x \vee (y \vee z) &= x \vee (y + z + yz) \\
&= x + y + z + yz + xy + xz + xyz
\end{aligned}$$

and 

$$\begin{aligned}
(x \vee y) \vee z &= (x + y + xy) \vee z \\
&= x + y + xy + z + xz + yz + xyz.
\end{aligned}$$

Property (4) is also true, for 

$$(x \wedge y) \vee x = xy \vee x = xy + x + xyx = x + xy + xy = x$$

and 

$$(x \vee y) \wedge x = (x + y + xy)x = x + yx + xyx = x + xy + xy = x.$$

These remarks show that A is a lattice. Further, this lattice is 
distributive, for 

$$\begin{aligned}
x \wedge (y \vee z) = x(y + z + yz) &= xy + xz + xyz \\
&= xy + xz + xyxz \\
&= xy \vee xz \\
&= (x \wedge y) \vee (x \wedge z).
\end{aligned}$$

It is easy to see that the elements 0 and 1 have the property that 
$0 \leq x \leq 1$ for every x, and that $x' = 1 + x$ acts as a complement for x, so A is 
a Boolean algebra. 
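
The definitions (14) can be illustrated on a familiar Boolean ring. The sketch below is only an illustration under our own assumptions, not part of the original text: it takes the ring of 4-bit vectors, with bitwise XOR as addition and bitwise AND as multiplication, and hypothetical function names. It checks that the join defined by (14) is bitwise OR, and that properties (4), (5), and (8) hold, with $1 + x$ serving as the complement.

```python
from itertools import product

N = 4                      # number of bits (an arbitrary choice)
R = range(1 << N)          # the ring Z_2^N encoded as integers
ONE = (1 << N) - 1         # multiplicative identity: all bits set

def add(x, y):
    return x ^ y           # addition in the ring

def mul(x, y):
    return x & y           # multiplication in the ring

def meet(x, y):
    return mul(x, y)                       # (14)

def join(x, y):
    return add(add(x, y), mul(x, y))       # (14): x + y + xy

def comp(x):
    return add(ONE, x)                     # x' = 1 + x

def join_is_bitwise_or():
    return all(join(x, y) == (x | y) for x, y in product(R, repeat=2))

def is_boolean_algebra():
    for x, y, z in product(R, repeat=3):
        if join(meet(x, y), x) != x or meet(join(x, y), x) != x:   # (4)
            return False
        if meet(x, join(y, z)) != join(meet(x, y), meet(x, z)):    # (5)
            return False
    # (8): 1 + x is a complement of x
    return all(meet(x, comp(x)) == 0 and join(x, comp(x)) == ONE for x in R)
```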

It is worth noting that the two processes we have described are 
inverses of one another. Suppose we start with a Boolean algebra A and 
use (13) to make it into a Boolean ring R: 

$$x + y = (x \wedge y') \vee (x' \wedge y) \quad\text{and}\quad xy = x \wedge y.$$

Next, we use (14) to convert R back into a Boolean algebra $\bar{A}$: 

$$x \mathbin{\bar{\wedge}} y = xy \quad\text{and}\quad x \mathbin{\bar{\vee}} y = x + y + xy.$$

It is apparent that $x \mathbin{\bar{\wedge}} y = xy = x \wedge y$; and since 

$$\begin{aligned}
1 + x = (1 \wedge x') \vee (1' \wedge x) &= (1 \wedge x') \vee (0 \wedge x) \\
&= x' \vee 0 = x',
\end{aligned}$$

we also have 

$$\begin{aligned}
x \mathbin{\bar{\vee}} y = x + y + xy &= 1 + (1 + x)(1 + y) \\
&= 1 + x'y' \\
&= (x' \wedge y')' \\
&= x \vee y.
\end{aligned}$$

This shows that the operations in $\bar{A}$ coincide with those in A. 
Conversely, if we start with a Boolean ring R, make it into a Boolean algebra 
A, and then convert A back into a Boolean ring $\bar{R}$, then the operations in 
$\bar{R}$ coincide with those in R. We leave the verification of this to the 
reader. 

The ideas developed above show that Boolean algebras are essen- 
tially identical with Boolean rings. The practical effect of this is a 
considerable saving of labor, for it allows us to transpose our study of 
Boolean algebras into the more familiar context of the theory of rings, 
where many standard tools—ideals, homomorphisms, etc.—lie ready at 
hand. We illustrate this principle by proving the basic representation 
theorem for Boolean algebras in two steps: first, we prove the corre- 
sponding theorem for Boolean rings; and second, we translate this result 
back into the language of Boolean algebras. 

Before entering into the details, we give a brief description of the type 
of representation we are aiming at. If X is a compact Hausdorff space, 
then each of the sets $\emptyset$ and X is both open and closed (or more briefly, 
open-closed), and the class A of all such sets is a Boolean algebra of subsets 
of X. If X is disconnected, then A contains at least three sets; and if X is 
totally disconnected, then A may contain a great many sets, for, by 
Theorem 33-C, it is an open base for the topology of X. Furthermore, 
we know that A becomes a Boolean ring of sets if addition and multi- 
plication are defined by 


$$A + B = (A \cap B') \cup (A' \cap B) \quad\text{and}\quad AB = A \cap B.$$


Our basic representation theorem states that every Boolean algebra 
(Boolean ring) is isomorphic to the Boolean algebra (Boolean ring) of all 
open-closed subsets of some totally disconnected compact Hausdorff 
space. 

Now for the details. The simplest of all Boolean rings is the ring 
{0,1} of integers mod 2, and this ring is evidently a field. Conversely, 
any Boolean ring which is a field necessarily equals {0,1}. To see this, it 
suffices to observe that if x is a non-zero element in such a ring, then 

$$1 = xx^{-1} = x^2x^{-1} = x(xx^{-1}) = x1 = x.$$

If I is a proper ideal in a Boolean ring R, then the quotient ring R/I is 
also a Boolean ring; for R/I clearly has an identity, and 

$$(x + I)^2 = x^2 + I = x + I$$

for every x in R. Thus, by Theorem 41-C, R/I = {0,1} $\Leftrightarrow$ I is maximal. 
Since every homomorphism of R arises from an ideal in R, this tells us 
that the homomorphisms of R onto {0,1} are precisely those of the form 
$R \to R/M$, where M is a maximal ideal in R. A standard application of 
Zorn’s lemma shows that R has maximal ideals, so there do exist homo- 
morphisms of R onto {0,1}. We shall need the following stronger 
statement. 


Lemma. If x is a non-zero element in a Boolean ring R, then there exists 
a homomorphism h of R onto {0,1} such that h(x) = 1. 

PROOF. By the above remarks, it suffices to show that there exists a 
maximal ideal in R which does not contain x. Since $x \neq 0$, there clearly 
exists at least one ideal which does not contain x. If Zorn’s lemma is 
applied to the set of all ideals which do not contain x, we obtain an ideal 
M which is maximal with respect to the property of not containing x. 
We conclude the proof by showing that M actually is a maximal ideal. 
To prove this, it suffices to show that M contains 1 + x (for it will then 
follow that any strictly larger ideal contains both x and 1 + x, and so 
contains 1). We therefore assume that M does not contain 1 + x, and we 
deduce a contradiction from this assumption. It is clear that 

$$I = \{m + r(1 + x) : m \in M \text{ and } r \in R\}$$

is the smallest ideal containing both M and 1 + x, so I properly contains 
M. However, I does not contain x; for if it did, we would have 

$$m + r(1 + x) = x$$

for some m and r, and this implies that 

$$\begin{aligned}
x = x^2 &= [m + r(1 + x)]x \\
&= mx + r(x + x^2) \\
&= mx + r(x + x) \\
&= mx,
\end{aligned}$$

contrary to the fact that x is not in M. This contradicts the maximality 
property of M, and the proof is complete. 
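
For a finite Boolean ring the conclusion of the lemma can be confirmed by brute force, with no appeal to Zorn’s lemma. The sketch below is an illustration under our own assumptions, not part of the original text: it takes the ring of 3-bit vectors (XOR as addition, AND as multiplication), enumerates every mapping of the ring into {0,1}, keeps those that are homomorphisms, and checks that each non-zero element is sent to 1 by at least one of them. The function names are hypothetical.

```python
from itertools import product

N = 3
R = list(range(1 << N))    # Boolean ring Z_2^N: + is XOR, * is AND
ONE = (1 << N) - 1         # the identity of the ring

def is_homomorphism(f):
    # f is a dict from R to {0, 1}; it must preserve +, *, and the identity.
    if f[ONE] != 1:
        return False
    return all(
        f[x ^ y] == f[x] ^ f[y] and f[x & y] == f[x] & f[y]
        for x, y in product(R, repeat=2)
    )

def homomorphisms():
    homs = []
    for values in product((0, 1), repeat=len(R)):
        f = dict(zip(R, values))
        if is_homomorphism(f):
            homs.append(f)
    return homs

def lemma_holds():
    homs = homomorphisms()
    # Every non-zero x is mapped to 1 by some homomorphism.
    return all(any(h[x] == 1 for h in homs) for x in R if x != 0)
```

In this small case the homomorphisms turn out to be the three coordinate evaluations, one for each maximal ideal of the ring.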


We are now in a position to prove our principal theorem. 

The Stone Representation Theorem. If R is a Boolean ring, then there 
exists a totally disconnected compact Hausdorff space H such that R is 
isomorphic to the Boolean ring of all open-closed subsets of H. 

PROOF. Let H* be the set of all mappings of R into the Boolean 
ring {0,1}. If for each x in R we define $H_x$ by $H_x = \{0,1\}$, then H* is the 
product set $P_{x \in R} H_x$. We now impose the discrete topology on each $H_x$, 
and thus convert it into a totally disconnected compact Hausdorff space. 
This permits us to regard H* as a product space, and it is also a totally 
disconnected compact Hausdorff space. For use in the next paragraph, 
we note that if x is any given element of R, then each of the sets 

$$\{f : f(x) = 0\}$$

and $\{f : f(x) = 1\}$ is open-closed. This follows at once from the fact that 
each is the inverse image of an open-closed set in $H_x$ under the projection 
of H* onto $H_x$. 

We now pass to the subspace H of H* which consists of all 
homomorphisms of R onto {0,1}. It is clear that H is a totally disconnected 
Hausdorff space. To prove that it is also compact, it suffices to show that 
it is closed in H*, and this we do as follows. A homomorphism of R onto 
{0,1} is of course a mapping f in H* such that $f(x + y) = f(x) + f(y)$ 
and $f(xy) = f(x)f(y)$ for all x and y and such that f(1) = 1. It is evident 
from this that H is the intersection of the following three subsets of H*: 

$$\bigcap_{x,y \in R} \{f : f(x + y) = f(x) + f(y)\}, \tag{15}$$

$$\bigcap_{x,y \in R} \{f : f(xy) = f(x)f(y)\}, \tag{16}$$

and 

$$\{f : f(1) = 1\}. \tag{17}$$

We know from our remark in the preceding paragraph that (17) is closed; 
and if we can show that the other two sets are also closed, then it will 
follow at once that H is closed. We inspect (15). If x and y are any 
given elements of R, then it is easy to see that 

$$\{f : f(x + y) = f(x) + f(y)\}$$

is the union of the following four sets: 

$$\{f : f(x) = 0,\ f(y) = 0, \text{ and } f(x + y) = 0\},$$
$$\{f : f(x) = 0,\ f(y) = 1, \text{ and } f(x + y) = 1\},$$
$$\{f : f(x) = 1,\ f(y) = 0, \text{ and } f(x + y) = 1\},$$

and 

$$\{f : f(x) = 1,\ f(y) = 1, \text{ and } f(x + y) = 0\}.$$

Each of these sets, being itself the intersection of three closed sets, is 
closed, so $\{f : f(x + y) = f(x) + f(y)\}$ is closed, and consequently (15) 
is also closed. A similar argument shows that (16) is closed, so H is 
closed and therefore compact. 

Our next step is to exhibit an isomorphism T of R into the Boolean 
ring $\mathbf{R}$ of all open-closed subsets of H. We define T by 

$$T(x) = \{f : f \in H \text{ and } f(x) = 1\}.$$

It is clear that T maps R into $\mathbf{R}$. T is also a homomorphism, for 

$$\begin{aligned}
T(x + y) &= \{f : f(x + y) = 1\} \\
&= \{f : f(x) + f(y) = 1\} \\
&= \{f : f(x) = 1\} + \{f : f(y) = 1\} \\
&= T(x) + T(y)
\end{aligned}$$

and 

$$\begin{aligned}
T(xy) &= \{f : f(xy) = 1\} \\
&= \{f : f(x)f(y) = 1\} \\
&= \{f : f(x) = 1\} \cap \{f : f(y) = 1\} \\
&= T(x)T(y).
\end{aligned}$$

Our lemma tells us that T(x) is non-empty whenever $x \neq 0$, so T is an 
isomorphism of R into $\mathbf{R}$. It will be useful in the next paragraph if we 
also note here that 

$$T(1) = \{f : f(1) = 1\} = H,$$

for it follows from this that 

$$T(1 + x) = T(1) + T(x) = H + T(x) = T(x)'$$

for every x in R. 
Finally, we show that T maps R onto $\mathbf{R}$. We begin by observing 
that the topology of H is defined by means of basic open sets of the form 

$$B = \{f : f(x_i) = \epsilon_i,\ i = 1, \ldots, n\},$$

where $\{x_1, \ldots, x_n\}$ is an arbitrary finite subset of R and each $\epsilon_i$ equals 
0 or 1. These sets are evidently closed as well as open. Furthermore, 
every set of this kind is in the range of T; for since 

$$\{f : f(x) = 0\} = \{f : f(1 + x) = 1\},$$

if we define $y_i$ to be $x_i$ or $1 + x_i$ according as $\epsilon_i$ equals 1 or 0, then 

$$\begin{aligned}
B &= \bigcap_{i=1}^{n} \{f : f(x_i) = \epsilon_i\} \\
&= \bigcap_{i=1}^{n} \{f : f(y_i) = 1\} \\
&= \bigcap_{i=1}^{n} T(y_i) \\
&= T(y_1 \cdots y_n).
\end{aligned}$$

We now consider an arbitrary open-closed set S in $\mathbf{R}$. Since S is compact 
and the B’s constitute an open base, S is the union of a finite number 
of B’s, say $B_1, \ldots, B_m$; and by the above result, each $B_j$ is expressible 
in the form $B_j = T(z_j)$ for some element $z_j$ in R. It now follows that 

$$\begin{aligned}
S = \bigcup_{j=1}^{m} B_j = \Bigl(\bigcap_{j=1}^{m} B_j'\Bigr)' &= \Bigl(\bigcap_{j=1}^{m} T(z_j)'\Bigr)' \\
&= \Bigl(\bigcap_{j=1}^{m} T(1 + z_j)\Bigr)' \\
&= \bigl(T([1 + z_1] \cdots [1 + z_m])\bigr)' \\
&= T(1 + [1 + z_1] \cdots [1 + z_m]).
\end{aligned}$$

This shows that T is an isomorphism of R onto $\mathbf{R}$, so the proof is complete. 
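
The construction in the proof can be carried out explicitly when R is finite. The sketch below is an illustration under our own assumptions, not part of the original text: R is the ring of 3-bit vectors (XOR as addition, AND as multiplication), H is found by enumerating all mappings of R into {0,1}, and since H is then a finite Hausdorff space, every subset of H is open-closed. The names are hypothetical. The code builds T and checks that it is one-to-one, onto the class of all subsets of H, and compatible with both ring operations, where the sum of two sets is their symmetric difference and the product is their intersection.

```python
from itertools import product

N = 3
R = list(range(1 << N))    # Boolean ring Z_2^N: + is XOR, * is AND
ONE = (1 << N) - 1

def homomorphisms():
    # All mappings f of R into {0, 1} that preserve +, *, and the identity.
    homs = []
    for values in product((0, 1), repeat=len(R)):
        f = dict(zip(R, values))
        if f[ONE] == 1 and all(
            f[x ^ y] == f[x] ^ f[y] and f[x & y] == f[x] & f[y]
            for x, y in product(R, repeat=2)
        ):
            homs.append(f)
    return homs

H = homomorphisms()        # finite, so every subset of H is open-closed

def T(x):
    # T(x) = {f in H : f(x) = 1}, stored as a frozenset of indices into H.
    return frozenset(i for i, f in enumerate(H) if f[x] == 1)

def T_is_isomorphism_onto_all_subsets():
    images = {T(x) for x in R}
    one_to_one = len(images) == len(R)
    onto = len(images) == 2 ** len(H)
    # For frozensets, ^ is symmetric difference and & is intersection.
    preserves = all(
        T(x ^ y) == T(x) ^ T(y) and T(x & y) == T(x) & T(y)
        for x, y in product(R, repeat=2)
    )
    return one_to_one and onto and preserves
```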


We now conclude our theory by translating Stone’s theorem into the 
language of Boolean algebras. 

Let A and A* be Boolean algebras. A mapping h of A into A* is 
called an isomorphism (or a Boolean algebra isomorphism) if it is one-to-one 
and has the following three properties: $h(x \wedge y) = h(x) \wedge h(y)$, 
$h(x \vee y) = h(x) \vee h(y)$, and $h(x') = h(x)'$. A is said to be isomorphic to A* if there exists an 
isomorphism of A onto A*. If A and A* are converted into Boolean 
rings R and R*, then it is easy to show that every Boolean algebra 
isomorphism of A onto A* is a Boolean ring isomorphism of R onto R*, 
and conversely. We leave the details to the reader. 

These ideas make it possible for us to state the following equivalent 
form of Stone’s theorem: If A is a Boolean algebra, then there exists a 
totally disconnected compact Hausdorff space H such that A is isomorphic to 
the Boolean algebra of all open-closed subsets of H. 


Index of Symbols 


a = b (mod m) 


Equality and inequality for sets, 5 

Set inclusion, 5 

Proper set inclusion, 6 

Union of two sets, 8 

Intersection of two sets, 9 

Complement of a set, 10 

Difference of two sets, 13 

Symmetric difference of two sets, 13 

Congruence modulo m for integers, 30 

Intervals on the real line, 5, 57 

Cardinal number of a countably infinite set, 34 

Closure of a set, 68, 96 

Order relation for self-adjoint operators, 268 

Total matrix algebra of degree n, 284 

Function algebra representing a commutative 
Banach algebra, 319 


Space of bounded (or continuous) linear trans- 
formations of N into N’, 221 

Algebra of operators on N, 222 

Stone-Čech compactification of X, 139, 141 


Cardinal number of the continuum, 39 

Complex number system, 23, 52-54, 214 

n-dimensional unitary space, 23-24, 89-90, 214 

Infinite-dimensional unitary space, 90 

Extended complex plane, 162-163 

Space of bounded continuous real functions on 
[0,1], 56 

Space of bounded continuous real functions on 
[a,b], 84 

Space of bounded continuous real functions on X, 
82, 106 


e(X,C) 
C(X,R), C0(X,C) 


e€(X) 


6; 
det ([as]) 


f2a:YoXx 
f(A) 
f-(B) 
f=g9 

F, 
fu 


G 
gf:X > Z 


Int (A) 
ix 

=> = 
();Ay, ete. 
—o, +o 


inf A 


Space of bounded continuous complex functions 
on X, 84, 106 

Spaces of continuous functions on X which vanish 
at infinity, 165 

Space of bounded continuous scalar-valued func- 
tions on X, 216 


Distance from one point to another, 51 
Distance from a point to a set, 58 
Diameter of a set, 58 

Derived set, 96 

Kronecker delta, 283 

Determinant of a matrix, 287 


Empty set, 5 


Function (or mapping) with domain X and range 
in Y, 16 

Inverse function (or mapping), 17 

Image of a set under a mapping, 18 

Inverse image of a set under a mapping, 18 

Equality for mappings, 20 

Induced functional on a conjugate space, 231 

Multiplicative functional induced by a maximal 
ideal, 321 


Group of regular elements in a Banach algebra, 305 
Product of two mappings f:X — Y andg:Y — Z, 
19 


Interior of a set, 63, 97 

Identity mapping on a set, 20 

Implication and logical equivalence, 6 

Intersection of a class of sets, 11 

Infinity (minus and plus), 56 

Infimum (or greatest lower bound) of a set of real 
numbers, 45, 57 

Integers modulo m, 182 

Ideal associated with a closed set, 329 


Limit of a sequence, 50, 71 
Banach space of n-tuples, 214 
Banach space of sequences, 215 


Ly 
ha 
lo, C, Co 


L/M 
1i(G) 


I 
min A, max A 


M:z 
M+WN 
MON 


m<nmeon 


N* 
N** 


N/M 


o(T) 
a(x) » 7A (x) 




Banach space of measurable functions, 215 

Banach space of n-tuples, 216 

Banach spaces of sequences, 216 

Quotient space of a linear space with respect to a 
subspace, 193-194 

Group algebra of a finite or discrete group, 303-305 


Space of maximal ideals, 319 

Minimum and maximum of a finite set of real 
numbers, 45 

Maximal ideal associated with a point, 327 

Sum of two subspaces of a linear space, 195 

Direct sum of two subspaces of a linear space, 195 

Order relation for cardinal numbers, 35, 48 


Conjugate space of a normed linear space, 224 

Second conjugate space of a normed linear space, 
231 

Quotient space of a normed linear space with 
respect to a closed subspace, 213 


Product of a class of sets, 25 
Projection of a product onto a coordinate set, 25 


Real number system, 21, 52, 214 

Coordinate plane, 22 

n-dimensional Euclidean space, 23-24, 85-89, 214 
Infinite-dimensional Euclidean space, 90 

Spectral radius of z, 310 

Resolvent set of x, 309 

Quotient ring of a ring with respect to an ideal, 187 


Set of singular elements in a Banach algebra, 305 

Closed unit sphere in a conjugate space, 233-234 

Orthogonal complement, 249 

Subspace spanned by S, 194 

Open sphere with radius r and center xo, 59 

Closed sphere with radius r and center xo, 66 

Supremum (or least upper bound) of a set of real 
numbers, 45, 56 

Spectrum of an operator, 289, 296 

Spectrum of an element in a Banach algebra, 
308 




T* 
T* 
ITI 
(T], [Tle 


U 
UA, ete. 


Conjugate of an operator, 241 
Adjoint of an operator, 263 
Norm of an operator, 220-221 
Matrix of an operator, 281 


Universal set, 5 
Union of a class of sets, 11 


Mapping notation, 17 

Convergent sequence, 50, 70-71, 132 

One-point compactification of X, 163 

Product of two sets, 23 

Equivalence set associated with z, 27 

Norm of 2, 54, 81, 212 

Norm of coset « + M, 213 

Function on maximal ideals, 318-319 

Function on maximal ideals, 319 

Resolvent of z, 309 

Meet and join of z and y, 46 

Inner product of x and y, 245 

x is (is not) an element of A, 5 

x is orthogonal to y, 249 

x is equivalent to y, 27 

Order relation for real numbers and partially 
ordered sets, 7, 43 

Congruence modulo an ideal in a ring, 186 

Congruence modulo a subspace in a linear space, 
193 

Adjoint of z, 324 


Set of topological divisors of zero in a Banach 
algebra, 307 


Subject Index 


Absolute value, on complex plane, 53 
of a function, 159 
on real line, 52 
Achieser, N.I., 157 
Adjoint, of element in Banach *-algebra, 
324 
of operator, 263 
Adjoint operation (involution), on @(H), 
265 
on Banach *-algebra, 324 
Alexandroff, P., 130n. 
Algebra, 106, 208 
B*-, 324 
Banach, 302 
Banach *-, 324 
Boolean (see Boolean algebra) 
center, 210 
commutative, 106 
complex, 106 
C*-, 303 
disc, 303 
division, 208 
group, 303-305 
homomorphism, 210 
ideal in, 209, 313 
with identity, 106 
isomorphism, 210 
quotient algebra of, 209 
radical, 314 
regular representation, 210 
semi-simple, 316 
subalgebra of, 106, 208 
total matrix, 284 
von Neumann, 303 
W*-, 303 
Antisymmetry, 43 
Arzelà’s theorem, 128 


Ascoli’s theorem, 126, 128 
Axiom of choice, 46 


B*-algebra, 324 
representation, 325-326 
Baire’s theorem, 74, 75n. 
Banach algebra, 302 
Banach subalgebra of, 302 
representation, 305 
Banach *-algebra, 324 
*-isomorphism, 324 
Banach space, 82, 212 
closed unit sphere, 217, 232 
representation, 234 
uniformly convex, 248 
Banach-Steinhaus theorem, 240 
Banach-Stone theorem, 330 
Bar-Hillel, Y., 7, 46n. 
Base, closed, 112 
generated by subbase, 101, 112 
open, 99 
Basis, 197 
orthonormal, 293 
Bell, E. T., 37 
Bernstein polynomials, 154 
Bers, L., 338 
Bessel’s inequality, 252-253, 257 
Birkhoff, G., 29, 46n., 47 
Birkhoff, G. D., 338 
Bolzano-Weierstrass property, 121 
Bolzano-Weierstrass theorem, 121 
Boolean algebra, 345 
as Boolean ring, 347-349 
isomorphism, 353 
representation, 353 
of sets, 12, 344 


Boolean ring, 346 
as Boolean algebra, 348-349 
as field, 349-350 
maximal ideals in, 350 
representation, 351 
semi-simplicity, 350 
Boundary, 68, 97 
Boundary point, 68, 97 
Bounded function, 55 
Bounded linear transformation, 220 
Bounded mapping, 58 
Bounded set, 58 
Brouwer’s fixed point theorem, 338 


C*-algebra, 303 
commutative, 332-334 


Canonical form problem for matrices, 286 


Cantor, G., 31-43, 49 
Cantor continuum hypothesis, 39 
Cantor intersection theorem, 73 
Cantor set, 67 
Cardinal number(s), 31 
comparability theorem, 48 
of continuum, 39 
finite, 32 
Cartesian product (product of sets), 
23-25 
Cauchy sequence, 71 
Cauchy’s inequality, 88, 219 
Cayley’s theorem, 181 
Chain, 44 
Characteristic equation, 288-289 
Characteristic value, 278n. 
Characteristic vector, 278n. 
Choice, axiom of, 46 
Class, 4 
disjoint, 9 
Closed base, 112 
generated by closed subbase, 112 
Closed graph theorem, 238 
Closed mapping, 342 
Closed rectangle, 101, 119 
Closed set, 65, 95 
Closed sphere, 66 
Closed strips, 101 
Closed subbase, 112 
Closed unit sphere, 217, 232 
Closure, 68, 96 
Compact subspace, 111 
Compact topological space, 110, 111 


Compactification, one-point, 163 
Stone-Čech, 141, 331 
Comparability theorem for cardinal 
numbers, 48 
Comparable elements, 7, 44 
Complete metric space, 71 
Complete orthonormal set, 255 
Completely regular space, 133 
Completion of metric space, 84-85 
Complex plane, 23, 52-54 
extended, 162 
Component of a space, 146 
Congruent modulo, an ideal, 186 
a linear subspace, 193 
a positive integer, 30 
Conjugate, of a function, 108, 161 
of operator, 241 
Conjugate space, 224 
Connected space, 142, 143 
Connected subspace, 143 
Continuous curve, 341-342 
Continuous function, 50 
Continuous image, 93 
Continuous linear functional, 224 


Continuous linear transformation, 219-220 
Continuous mapping, 76, 93 
jointly, 118 
at point, 75-76, 104 
in single variable, 118 
Continuum, cardinal number, 39 
hypothesis, 39 
Contraction, 338 


Convergence of functions, pointwise, 83 


uniform, 83 


Convergent sequence, limit, 50, 71, 132 


of numbers, 50 

in a space, 70, 132 
Convex set, 148 
Convolution, 304-305 
Coordinate plane, 22 
Cosets, 186-187, 193-194 
Countably compact space, 114 
Courant, R., 94, 338 
Curve, continuous, 341-342 


Dense (everywhere dense) set, 70, 96 
Derived set, 96 
Determinant, of matrix, 287 

of operator, 288 


Dimension, of a linear space, 200 
orthogonal, of a Hilbert space, 259-260 

Disc algebra, 303 

Disconnected space, 143 

Disconnection of a space, 143 

Discrete space, 93 

Discrete topology, 93 

Discrete two-point space, 144 

Disjoint linear subspaces, 195 

Disjoint sets, 9 

Distance, from point to set, 58 
between two points, 51 

Distributive laws, for lattices, 345 
for sets, 10 

Division algebra, 208 

Divisor of zero, 183 
topological, 307 

Dixmier, J., 303n. 

Dunford, N., ix, 226, 232n. 


Eigenspace, 278 
Eigenvalue, 278 
Eigenvector, 278 
Element(s), 3 
comparable, 7, 44 
maximal, 44 
in ring, regular, 183, 314 
singular, 183, 314 
Empty set, 5 
e-net, 123 
Equicontinuous functions, 126 
Equivalence relation, 27 
Equivalence set, 27 
Euclidean plane, 22, 87-88 
Euclidean space, infinite-dimensional, 90 
n-dimensional, 24, 87, 214 
Everywhere dense set, 70, 96 
Extended complex plane, 162 
Extended real number system, 56 


Family, 4 

Field, 184 

Finite intersection property, 47, 112 

First countable space, 100n. 

Fixed point, 338 

Fixed point space, 337-338 

Fixed point theorem, Brouwer’s, 338 
Schauder’s, 338 

Fomin, S. V., 128, 215n. 

Fourier coefficients, 256, 257 




Fourier expansion, 256, 257 
Fraenkel, A. A., 7, 42n, 46n. 
Full linear group, 207 
Function(s), absolute value, 159 
bounded, 55 
complex, 17 
conjugate, 108, 161 
constant, 16 
continuous, 50 
at point, 50 
in contrast to mapping, 17 
convergence, pointwise, 83 
uniform, 83 
definition, 16 
domain, 15, 16 
equicontinuous, 126 
extension, 17 
generalities, 14-16 
imaginary part, 161 
moments, 157 
range, 15, 16 
real, 17 
real part, 161 
restriction, 17 
uniformly bounded, 128n. 
vanishing at infinity, 165 
(See also Mapping) 
Function spaces, 82 
Functional(s), 224 
extension, 226-228 
on Hilbert space, representation, 261 
induced, 231 
multiplicative, 321 
Fundamental theorem of algebra, 245, 
289, 310 


Gal, I. S., 240 
Galileo, 33 
Gelfand mapping, 319 
Gelfand representation theorem, 322 
Gelfand-Neumark theorems, 318, 325-326 
Gödel, K., 39n. 
Goffman, C., 128 
Goldberg, R. R., 305n. 
Gram-Schmidt process, 258, 295 
Graph of mapping, 23 
Group, 172-173 
Abelian (commutative), 173, 179 
additive, 179 




Group, abstract vs. concrete, 173n. 
center of, 180 
circle, 174 
finite, 173 
full linear, 207 
homomorphism, 180 
identity in, 173, 179 
infinite, 173 
inverses, 173, 179 
isomorphic, 180 
isomorphism, 180 
order, 173 
permutation, 178 
regular representation, 181 
subgroup of, 178 
symmetric, 176 
of symmetries of square, 176-177 
transformation, 178, 181 
Group algebra, 303-305 


Hahn, H., 343 

Hahn-Banach theorem, 211, 228 
generalized form, 230-231 

Hahn-Mazurkiewicz theorem, 343 


Halmos, P. R., 42n., 46n., 172, 215n. 


Hausdorff space, 130 
Heine-Borel theorem, 110, 114 
converse, 115 
generalized, 119 
Hermite functions, 259 
Hewitt, E., 121n. 
Hilbert, D., 49 
Hilbert cube, 248 
Hilbert space(s), 245 


among complex Banach spaces, 248 


inner product, 245 

orthogonal complements, 249 

orthogonal dimension, 259-260 

orthogonal subspaces, 250 

orthogonal vectors, 249 

orthonormal set, 251 
complete, 255 

parallelogram law, 247 

Pythagorean theorem, 249 

representation, 260 


representation of functionals on, 261 
Hilbert’s space-filling curve, 341-342 


Hille, E., ix, 232n. 
Hölder’s inequality, 218 
general form, 219 
in relation to Cauchy’s, 219 

Homeomorphic image, 94 

Homeomorphic spaces, 94 

Homeomorphism, 93 

Hopf, H., 130n. 

Hurewicz, W., 150 


Ideal, in algebra, 209, 313 
contrasted with ring ideal, 209 
maximal, 314 

in ring, 184, 185 
general significance, 188-190 
maximal, 190 
Image, continuous, 93 
homeomorphic, 94 
Induced functionals, 231 
Inequality, Bessel’s, 252-253, 257 
Cauchy’s, 88, 219 
Hölder’s, 218, 219 
Minkowski’s, 88, 90, 218, 219 
Schwarz’s, 246 
triangle, 51 
Infimum, 45
Infinite-dimensional Euclidean space, 90
Infinite-dimensional unitary space, 90
Inner product, 245
Interior, 63, 97
Interior point, 63, 97
Intervals, 5, 57
Involution, 324
Isolated point, 96
Isometric isomorphism, 222 
Isometry, 79 


Join, 46 
Joint continuity, 118 
Jordan, C., 341-342 


Kadison, R. V., 269
Kakutani, S., 265n.
Kamke, E., 42n.
Kelley, J. L., 139n.
Kellogg, O. D., 338
Kolmogorov, A. N., 128, 215n. 
Kronecker delta, 283 
Kuratowski closure axioms, 98 


Laguerre functions, 259 
Lattice, 46, 344 
characterization, 344-345 
complemented, 345 
complete, 47 
distributive, 345 
sublattice of, 47 
Laurent expansion, 313 
Least upper bound property, 21, 45
Lebesgue, H., 49 
Lebesgue covering lemma, 122 
Lebesgue number, 122 
Legendre polynomials, 259 
Limit, in the mean, 257 
of sequence, 50, 70-71, 132 
Limit point, 65, 96 
contrasted with limit, 72 
Lindelöf’s theorem, 100
Linear space, 81, 191 
basis for, 197 
dimension, 200 
isomorphism, 200 
linear combination in, 194 
linear dependence in, 196-197 
linear independence in, 196-197 
linear operations in, 191 
linear subspace(s), 81, 193 
disjoint, 195 
sum of, 195 
direct, 195 
normed, 54, 81, 212 
quotient space, 193-194 
representation, 201-202 
Linear transformation(s), 203 
bound for, 220 
bounded, 220 
continuous, 219, 220 
idempotent, 206 
identity, 205 
inverse, 205 
negative, 204 
non-singular, 205 
norm, 220-221 
null space, 207 
nullity, 207-208 
product, 204 
range, 207 
rank, 208 
scalar multiple, 204 
zero, 204
Linearly ordered set, 44 
Liouville’s theorem, 309-310 
Lipschitz condition, 339n. 
Locally compact space, 120, 162 
Locally connected space, 151 
Loomis, L. H., ix, 215n., 305n. 
Lorch, E. R., 296n. 
Lorentz, G. G., 157 
Lower bound, 44
greatest, 44-45
L_p space, 215


McCoy, N. H., 172
Mackey, G. W., 265n., 305n.
MacLane, S., 29
Mapping(s), 16
bounded, 58
closed, 342
composition (multiplication) of, 19
continuous, 76, 93
jointly, 118
at point, 75-76, 104 
in single variable, 118 
uniformly, 77 
contrasted with function, 17 
equality for, 20 
Gelfand, 319 
graph, 23 
identity, 20 
into, 17 
inverse, 17 
isometric, 79 
one-to-one, 17 
onto, 17 
open, 93 
product, 19 
of sets, 18-19 
(See also Function) 

Matrices, canonical form problem, 286 
operations for, 282-283 
similar, 286 

Matrix, conjugate transpose, 294 
determinant, 287 
diagonal, 287 
identity, 283 
as independent entity, 281, 284 
inverse, 284 
non-singular, 284 
of operator, 281 
scalar, 286
triangular, 295
zero, 283 
Maximal element, 44 
Maximal ideal space, 319 
Maximum, 45 
Maximum modulus theorem, 311 
Meet, 46 
Metric, 51 
Metric space, 50, 51 
complete, 71 
completion, 84-85 
contraction in, 338 
sequentially compact, 121 
subspace of, 56 
totally bounded, 123 
Metrizable space, 93 
Minimum, 45 
Minkowski’s inequality, 88, 90, 218, 219 
Module, 191n. 
Moments of a function, 157 
Morera’s theorem, 160, 303 
Multiplicative functional, 321 
Müntz’s theorem, 157


n-dimensional Euclidean space, 87 
n-dimensional unitary space, 90 
Naimark (or Neumark), M. A., ix 
(See also Gelfand-Neumark theorems) 
Natural imbedding, 232 
Neighborhood, 96 
Neumark (see Naimark) 
Niven, I., 43 
Norm(s), 54, 81, 212 
equivalent, 223 
metric induced by, 54, 81, 212 
uniform, 216 
Normal operator, 269 
Normal space, 133 
Normed linear space, 54, 81, 212 
conjugate space of, 224 
isometric isomorphism, 222 
locally compact, 224 
natural imbedding, 232 
reflexive, 232 
representation, 234 
second conjugate space of, 231 
strong topology, 232 
weak topology, 232 
weak* topology, 232-233 
Nowhere dense set, 74, 99 


Numbers, algebraic, 43 
cardinal, 31 
comparability theorem, 48 
finite, 32 
complex, 23, 52-54 
real, 21 
transcendental, 43 


One-to-one correspondence, 18 
One-point compactification, 163 
Open base, 99 
generated by open subbase, 101 
for point (at point), 96 
Open-closed set, 349 
Open cover, 111 
basic, 112 
subbasic, 112 
subcover of, 111 
Open mapping, 93 
Open mapping theorem, 211, 236 
Open rectangles, 101, 119 
Open set, 60, 91, 92 
basic, 99 
subbasic, 101 
Open sphere, 59 
Open strips, 101 
Open subbase, 101 
Operator(s), 222 
adjoint, 263 
characteristic equation, 288-289 
conjugate, 241 
determinant, 288 
eigenspace, 278 
eigenvalue, 278 
eigenvector, 278 
imaginary part, 271 
matrix, 281 
normal, 269 
projection(s), on Banach space, 237 
on Hilbert space, 274 
orthogonal, 276 
real part, 271 
reduced by subspace, 275 
ring, 303 
self-adjoint, 266 
ordering, 268 
positive, 268 
spectral resolution, 280, 291 
uniqueness, 291-293 
spectral theorem, 280, 290, 295-297
spectrum, 289, 296
square root, 294 
subspace invariant under, 275 
unitary, 272 
Order relation, partial, 7, 43 
on real line, 7 
total (or linear), 7, 44 
Origin, 54, 80, 191 
Orthogonal complement, 249 
Orthogonal dimension, 259-260 
Orthogonal vectors, 249 
Orthonormal basis, 293 
Orthonormal set, 251 
complete, 255 


Parallelogram law, 247 
Parseval’s equation, 256, 257 
Partial order relation, 7, 43 
Partially ordered set, 43-44 
Partition, 26 
Partition sets, 26 
Peano, G., 341-342 
Peano space, 342n. 
Perfect set, 99 
Permutation, 176 
Phillips, R. S., ix, 232n. 
Picard’s theorem, 339 
Plane, complex, 23, 52-54 
coordinate, 22 
Euclidean, 22, 87-88 
Point, boundary, 68, 97 
fixed, 338 
at infinity, 162, 163 
interior, 63, 97 
isolated, 96 
limit, 65, 96 
neighborhood of, 96 
in a space, 51, 92
Pointwise convergence, 83 
Pointwise operations, 55, 82, 104 
Product of sets, 23-25 
Product space, 117 
Product topology, 116 
closed subbase, 117 
open base, 117 
open subbase, 116 
Projection, 25 
on Banach space, 237 
on Hilbert space, 274 
on linear space, 205-207 
Proper value, 278n.
Proper vector, 278n. 
Pseudo-metric, 58 
Pythagorean theorem, 249 


Quotient, algebra, 209 
ring, 187 
space, 193-194 


Radical, 314 
Real line, 21 
absolute value on, 52 
extended, 56 
least upper bound property, 21, 45 
usual metric on, 52 
Rectangle, closed, 101, 119 
open, 101, 119 
Reflexivity, of normed linear space, 232 
of relation, 27, 43 
Relation (binary), 26 
antisymmetric, 43 
circular, 31 
equivalence, 27 
partial order, 7, 43 
reflexive, 27, 43 
symmetric, 27 
transitive, 27, 43 
triangular, 31 
Relative topology, 93 
Representation, of algebra, 210 
of B*-algebra, 325-326 
of Banach algebra, 305 
of Banach space, 234 
of Boolean algebra, 353 
of Boolean ring, 351 
of commutative C*-algebra, 332-334 
of commutative semi-simple Banach algebra, 322
of group, 181 
of Hilbert space, 260 
of linear space, 201-202 
of ring, 188-190 
Resolvent of an element, 309 
Resolvent equation, 309 
Resolvent set, 309 
Rickart, C. E., ix, 326n. 
Riemann, B., 49 
Riemann sphere, 163 
Riesz, F., 49, 226, 296n.
Riesz representation theorem, 226 
Riesz-Fischer theorem, 257-258 
Ring, 181 
commutative, 183 
coset in, 186 
division, 184 
divisor of zero, 183 
elements in, invertible, 183 
non-singular, 183 
regular, 183 
singular, 183 
homomorphism, 188 
ideal in, 184 
with identity, 183 
of integers mod m, 182 
inverses, 183 
isomorphism, 188 
kernel, 188 
of operators, 303 
quotient, 187 
of sets, 14, 182 
subring of, 184 
Robbins, H., 94, 338 
Russell, B., 7 
Russell’s paradox, 6 


Scalars, 80, 191, 214 
Schauder’s fixed point theorem, 338 
Schroeder-Bernstein theorem, 29 
Schwartz, J. T., ix, 226, 232n. 
Schwarz’s inequality, 246 
Second conjugate space, 231 
Second countable space, 99-100 
Self-adjoint operator, 266 
Self-adjoint subalgebra of B(H), 303
Semi-simple algebra, 316 
Separable space, 96 
Sequence, Cauchy, 71
convergent, 50, 70, 132
limit of, 50, 71, 132
Sequentially compact metric space, 121
Set(s), 4
abnormal, 6
Boolean algebra, 12, 344
boundary, 68, 97
boundary point, 68, 97
bounded, 58
Cantor, 67
Cartesian product, 21
closed, 65, 95
closure, 68, 96
complement, 10 
contrasted with space, 5n. 
convex, 148 
countable, 34 
countably infinite, 34 
dense (everywhere dense), 70, 96 
derived, 96 
diagrams, 8 
diameter, 58 
difference of, 13 
disjoint, 9 
distance from point to, 58 
empty, 5 
equality, 5 
equivalence, 27 
finite, 5 
finite intersection, 12 
finite union, 12 
of first category, 75n. 
inclusion, 6 
index, 11 
infinite, 5 
interior of, 63, 97 
interior point, 63, 97 
intersection, 9, 11 
linearly ordered, 44 
neighborhood of, 96 
normal, 6 
nowhere dense, 74, 99 
numerical equivalence, 28, 32 
open, 60, 91, 92 

basic, 99 
subbasic, 101 
open-closed, 349 
orthonormal, 251 
complete, 255 
partially ordered, 43-44 
partition, 26 
perfect, 99 
product, 23-25 
proper subset, 6 
proper superset, 6 
ring, 14 
of second category, 75n.
subset, 5 
superset, 5 
symmetric difference, 13 
totally ordered, 44 
uncountable, 36 
union, 8, 11
universal, 5, 7-8
Set mappings, 18-19 
Sierpinski, W., 42n., 46n. 
Similarity for matrices, 286 
Smirnov, Y. M., 139n. 
Space, contrasted with set, 5n. 
Euclidean, 24, 87, 90, 214 
fixed point, 337-338 
of maximal ideals, 319 
metrizable, 93 
unitary, 24, 90, 214 
(See also Banach space; Hilbert space; Linear space; Metric space; Normed linear space)
Space-filling curve(s), 341 
Hilbert’s, 341-342 
Spectral radius, 310 
formula, 312 
Spectral resolution, 280 
Spectral theorem, 280, 290 
generalized forms, 295-297, 334 
Spectrum, of element in Banach algebra, 308
of operator, 289, 296 
Sphere, closed, 66 
closed unit, 217, 232 
open, 59 
Stone, M. H., 141n., 153, 161n. 
(See also Banach-Stone theorem) 
Stone representation theorem, for Boolean algebras, 353
for Boolean rings, 351 
Stone-Čech compactification, 141, 331
Stone-Weierstrass theorem(s), complex, 161
extended, 166-167 
real, 160 
Strips, 101 
Strong topology, 232 
Strongest topology, 104 
Subbase, closed, 112 
open, 101 
Subcover, 111 
Supremum, 45 
Symmetry, 27, 51 
Sz.-Nagy, B., 226, 296n. 


Taylor, A. E., 215n., 239n., 248 
Tietze extension theorem, 136 
Topological divisor of zero, 307 
Topological space(s), 91, 92
compact, 110, 111 
countably, 114 
locally, 120, 162 
compact subspace, 111 
completely regular, 133 
components, 146 
connected, 142, 143 
connected subspace, 143 
disconnected, 143 
disconnection, 143 
discrete, 93 
discrete two-point, 144 
first countable, 100n. 
Hausdorff, 130 
homeomorphic, 94 
locally connected, 151 
metrizable, 93 
normal, 133 
open base, 99 
open subbase, 101 
Peano, 342n. 
product, 117 
second countable, 99-100 
separable, 96 
subspace, 93 
T1-, 130 
totally disconnected, 149 
Topology, 92 
as branch of mathematics, 94 
discrete, 93 
generated by given class of sets, 102 
open base, 99 
open subbase, 101 
product, 116-117 
relative, 93 
strong, on normed linear space, 232 
strongest, 104 
usual, 92 
weak, generated by set of mappings, 105 
on normed linear space, 232 
weak*, on conjugate space, 232-233 
weak operator, 303 
weakest, 104 
Total matrix algebra, 284 
Totally bounded metric space, 123 
Totally disconnected space, 149 
Totally ordered set, 44 
Transitivity, 27, 43 
Triangle inequality, 51 
Tychonoff’s theorem, 119
Uniform boundedness theorem, 211, 239-240
Uniform continuity, 77 
Uniform convergence, 83 
Uniform norm, 216 
Uniformly bounded functions, 128n. 
Uniformly convex Banach space, 248 
Unit circle, 5 
Unit disc, closed, 5 
open, 5 
Unitary operator, 272 
Unitary space, infinite-dimensional, 90 
n-dimensional, 24, 90, 214 
Universal set, 5, 7-8 
Upper bound, least, 45 
Urysohn’s imbedding theorem, 138 
Urysohn’s lemma, 135
Usual topology on metric space, 92


Vector space (see Linear space) 
Vectors, 81, 86-87, 191 
characteristic, 278n.
eigen-, 278 
linearly dependent, 196-197 
linearly independent, 196-197 
orthogonal, 249
proper, 278n.
von Neumann algebra, 303 


W*-algebra, 303 
Wallman, H., 150 
Weak operator topology, 303 
Weak topology, generated by set of mappings, 105
on normed linear space, 232
Weak* topology on conjugate space, 232-233
Weakest topology, 104 
Weierstrass, K., 49 
(See also Bolzano-Weierstrass; Stone-Weierstrass)
Weierstrass approximation theorem, 154, 161
Weierstrass intermediate value theorem, 142, 144
Wilder, R. L., 6, 39n., 46n., 343


Zaanen, A. C., 215n. 
Zero, 54, 80, 179 
divisor, 183 
topological, 307 
Zero space, 192 
Zorn’s lemma, 45-46 
Zygmund, A., 240 


